The stock market has consistently proven to be a good place to invest in and save for the future. There are a lot of compelling reasons to invest in stocks. It can help in fighting inflation, create wealth, and also provides some tax benefits. Good steady returns on investments over a long period of time can also grow a lot more than seems possible. Also, thanks to the power of compound interest, the earlier one starts investing, the larger the corpus one can have for retirement. Overall, investing in stocks can help meet life's financial aspirations.
It is important to maintain a diversified portfolio when investing in stocks in order to maximise earnings under any market condition. Having a diversified portfolio tends to yield higher returns and face lower risk by tempering potential losses when the market is down. It is often easy to get lost in a sea of financial metrics to analyze while determining the worth of a stock, and doing the same for a multitude of stocks to identify the right picks for an individual can be a tedious task. By doing a cluster analysis, one can identify stocks that exhibit similar characteristics and ones which exhibit minimum correlation. This will help investors better analyze stocks across different market segments and help protect against risks that could make the portfolio vulnerable to losses.
Trade&Ahead is a financial consultancy firm who provide their customers with personalized investment strategies. They have hired you as a Data Scientist and provided you with data comprising stock price and some financial indicators for a few companies listed under the New York Stock Exchange. They have assigned you the tasks of analyzing the data, grouping the stocks based on the attributes provided, and sharing insights about the characteristics of each group.
# Installing the libraries with the specified version.
# uncomment and run the following line if Google Colab is being used
#!pip install scikit-learn==1.2.2 seaborn==0.13.1 matplotlib==3.7.1 numpy==1.25.2 pandas==1.5.3 yellowbrick==1.5 -q --user
# Import libraries we need to do our work
# Read and manipulate data
import numpy as np
import pandas as pd
# HTML styling
from IPython.display import display, HTML
# Data visualization
import matplotlib.pyplot as plt
import seaborn as sns
# Statistics
from scipy.stats import shapiro
# Suppress warnings
import warnings
warnings.filterwarnings("ignore")
# Installing the libraries with the specified version.
# uncomment and run the following lines if Jupyter Notebook is being used
#!pip install scikit-learn==1.2.2 seaborn==0.13.1 matplotlib==3.7.1 numpy==1.25.2 pandas==1.5.2 yellowbrick==1.5 -q --user
#!pip install --upgrade -q jinja2
# Load Google Colab
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
# Define path to data file
path = r'/content/drive/MyDrive/Learning/Data Coursework/PGP-DSBA/7-Unsupervised Learning/Project 7/'
df = pd.read_csv(path + 'stock_data.csv')
# Iterate through each column and determine its data type
# Calculate number of rows and columns
num_rows, num_cols = df.shape
# Initialize dictionary to store counts of each data type
data_type_counts = {}
# Iterate through each column and determine its data type
for col in df.columns:
data_type = str(df[col].dtype)
if data_type in data_type_counts:
data_type_counts[data_type] += 1
else:
data_type_counts[data_type] = 1
# # Check size of dataset and print summary information
print(f'The data has {num_rows} rows and {num_cols} columns total:')
print()
for dtype, count in data_type_counts.items():
print(f'{count} columns of type {dtype}')
The data has 340 rows and 15 columns total: 4 columns of type object 7 columns of type float64 4 columns of type int64
# View information
print('Training')
display(df.info())
Training <class 'pandas.core.frame.DataFrame'> RangeIndex: 340 entries, 0 to 339 Data columns (total 15 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Ticker Symbol 340 non-null object 1 Security 340 non-null object 2 GICS Sector 340 non-null object 3 GICS Sub Industry 340 non-null object 4 Current Price 340 non-null float64 5 Price Change 340 non-null float64 6 Volatility 340 non-null float64 7 ROE 340 non-null int64 8 Cash Ratio 340 non-null int64 9 Net Cash Flow 340 non-null int64 10 Net Income 340 non-null int64 11 Earnings Per Share 340 non-null float64 12 Estimated Shares Outstanding 340 non-null float64 13 P/E Ratio 340 non-null float64 14 P/B Ratio 340 non-null float64 dtypes: float64(7), int64(4), object(4) memory usage: 40.0+ KB
None
# Describe the data. Can also use include=all to also show categorical series.
df.describe(include='all').T
| count | unique | top | freq | mean | std | min | 25% | 50% | 75% | max | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Ticker Symbol | 340 | 340 | AAL | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Security | 340 | 340 | American Airlines Group | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| GICS Sector | 340 | 11 | Industrials | 53 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| GICS Sub Industry | 340 | 104 | Oil & Gas Exploration & Production | 16 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Current Price | 340.0 | NaN | NaN | NaN | 80.862345 | 98.055086 | 4.5 | 38.555 | 59.705 | 92.880001 | 1274.949951 |
| Price Change | 340.0 | NaN | NaN | NaN | 4.078194 | 12.006338 | -47.129693 | -0.939484 | 4.819505 | 10.695493 | 55.051683 |
| Volatility | 340.0 | NaN | NaN | NaN | 1.525976 | 0.591798 | 0.733163 | 1.134878 | 1.385593 | 1.695549 | 4.580042 |
| ROE | 340.0 | NaN | NaN | NaN | 39.597059 | 96.547538 | 1.0 | 9.75 | 15.0 | 27.0 | 917.0 |
| Cash Ratio | 340.0 | NaN | NaN | NaN | 70.023529 | 90.421331 | 0.0 | 18.0 | 47.0 | 99.0 | 958.0 |
| Net Cash Flow | 340.0 | NaN | NaN | NaN | 55537620.588235 | 1946365312.175789 | -11208000000.0 | -193906500.0 | 2098000.0 | 169810750.0 | 20764000000.0 |
| Net Income | 340.0 | NaN | NaN | NaN | 1494384602.941176 | 3940150279.327937 | -23528000000.0 | 352301250.0 | 707336000.0 | 1899000000.0 | 24442000000.0 |
| Earnings Per Share | 340.0 | NaN | NaN | NaN | 2.776662 | 6.587779 | -61.2 | 1.5575 | 2.895 | 4.62 | 50.09 |
| Estimated Shares Outstanding | 340.0 | NaN | NaN | NaN | 577028337.75403 | 845849595.417695 | 27672156.86 | 158848216.1 | 309675137.8 | 573117457.325 | 6159292035.0 |
| P/E Ratio | 340.0 | NaN | NaN | NaN | 32.612563 | 44.348731 | 2.935451 | 15.044653 | 20.819876 | 31.764755 | 528.039074 |
| P/B Ratio | 340.0 | NaN | NaN | NaN | -1.718249 | 13.966912 | -76.119077 | -4.352056 | -1.06717 | 3.917066 | 129.064585 |
# Check for empty values
print(f'Dataset has {df.isnull().sum().sum()} null values:',)
# Replace any missing values with NaNs
df = df.replace("?", np.nan)
Dataset has 0 null values:
# Retrieve and preview the categorial column values, using a list.
df_eda = df.copy()
# Create numerical and categorical column lists
list_catcol = df_eda.select_dtypes(include=['object']).columns.to_list()
list_numcol = df_eda.select_dtypes(exclude=['object']).columns.to_list()
print('Dataset has', len(list_catcol), 'categorical columns and', len(list_numcol), 'numerical columns.')
Dataset has 4 categorical columns and 11 numerical columns.
# Use a for loop to show the counts and percent contribution of each numerical column.
for col in list_numcol:
print(f'***** {col} *****')
print('Preview counts for:',df_eda[col].value_counts(),'\n')
print('Preview % breakdown for:',df_eda[col].value_counts(normalize=True),'\n')
***** Current Price *****
Preview counts for: Current Price
54.360001 2
114.379997 2
43.230000 2
32.930000 2
103.889999 1
..
51.070000 1
52.130001 1
81.589996 1
302.399994 1
47.919998 1
Name: count, Length: 336, dtype: int64
Preview % breakdown for: Current Price
54.360001 0.005882
114.379997 0.005882
43.230000 0.005882
32.930000 0.005882
103.889999 0.002941
...
51.070000 0.002941
52.130001 0.002941
81.589996 0.002941
302.399994 0.002941
47.919998 0.002941
Name: proportion, Length: 336, dtype: float64
***** Price Change *****
Preview counts for: Price Change
9.999995 1
6.237855 1
6.585554 1
5.796884 1
9.529966 1
..
8.037605 1
10.019650 1
-4.078593 1
3.654238 1
16.678836 1
Name: count, Length: 340, dtype: int64
Preview % breakdown for: Price Change
9.999995 0.002941
6.237855 0.002941
6.585554 0.002941
5.796884 0.002941
9.529966 0.002941
...
8.037605 0.002941
10.019650 0.002941
-4.078593 0.002941
3.654238 0.002941
16.678836 0.002941
Name: proportion, Length: 340, dtype: float64
***** Volatility *****
Preview counts for: Volatility
1.687151 1
1.023375 1
1.460619 1
1.281566 1
2.168814 1
..
1.056186 1
1.308082 1
1.941104 1
1.404508 1
1.610285 1
Name: count, Length: 340, dtype: int64
Preview % breakdown for: Volatility
1.687151 0.002941
1.023375 0.002941
1.460619 0.002941
1.281566 0.002941
2.168814 0.002941
...
1.056186 0.002941
1.308082 0.002941
1.941104 0.002941
1.404508 0.002941
1.610285 0.002941
Name: proportion, Length: 340, dtype: float64
***** ROE *****
Preview counts for: ROE
10 24
9 16
14 16
11 16
17 15
..
582 1
116 1
68 1
63 1
142 1
Name: count, Length: 82, dtype: int64
Preview % breakdown for: ROE
10 0.070588
9 0.047059
14 0.047059
11 0.047059
17 0.044118
...
582 0.002941
116 0.002941
68 0.002941
63 0.002941
142 0.002941
Name: proportion, Length: 82, dtype: float64
***** Cash Ratio *****
Preview counts for: Cash Ratio
99 37
47 19
25 10
70 8
10 7
..
82 1
128 1
175 1
164 1
100 1
Name: count, Length: 124, dtype: int64
Preview % breakdown for: Cash Ratio
99 0.108824
47 0.055882
25 0.029412
70 0.023529
10 0.020588
...
82 0.002941
128 0.002941
175 0.002941
164 0.002941
100 0.002941
Name: proportion, Length: 124, dtype: float64
***** Net Cash Flow *****
Preview counts for: Net Cash Flow
159000000 2
39000000 2
160383000 2
-8000000 2
23000000 2
..
4073000 1
-14756000 1
523803000 1
2196000 1
-43623000 1
Name: count, Length: 332, dtype: int64
Preview % breakdown for: Net Cash Flow
159000000 0.005882
39000000 0.005882
160383000 0.005882
-8000000 0.005882
23000000 0.005882
...
4073000 0.002941
-14756000 0.002941
523803000 0.002941
2196000 0.002941
-43623000 0.002941
Name: proportion, Length: 332, dtype: float64
***** Net Income *****
Preview counts for: Net Income
1899000000 2
355000000 2
848000000 2
1034000000 2
636056000 2
..
268000000 1
232120000 1
878485000 1
85171000 1
339000000 1
Name: count, Length: 335, dtype: int64
Preview % breakdown for: Net Income
1899000000 0.005882
355000000 0.005882
848000000 0.005882
1034000000 0.005882
636056000 0.005882
...
268000000 0.002941
232120000 0.002941
878485000 0.002941
85171000 0.002941
339000000 0.002941
Name: proportion, Length: 335, dtype: float64
***** Earnings Per Share *****
Preview counts for: Earnings Per Share
3.28 5
3.71 4
3.03 4
1.80 4
1.58 4
..
5.87 1
1.86 1
1.77 1
1.31 1
1.20 1
Name: count, Length: 268, dtype: int64
Preview % breakdown for: Earnings Per Share
3.28 0.014706
3.71 0.011765
3.03 0.011765
1.80 0.011765
1.58 0.011765
...
5.87 0.002941
1.86 0.002941
1.77 0.002941
1.31 0.002941
1.20 0.002941
Name: proportion, Length: 268, dtype: float64
***** Estimated Shares Outstanding *****
Preview counts for: Estimated Shares Outstanding
1.992378e+08 4
5.139877e+08 3
1.115226e+08 3
1.469542e+09 2
1.633016e+09 2
..
3.671392e+08 1
5.777662e+07 1
5.457799e+08 1
1.485114e+08 1
4.985294e+08 1
Name: count, Length: 323, dtype: int64
Preview % breakdown for: Estimated Shares Outstanding
1.992378e+08 0.011765
5.139877e+08 0.008824
1.115226e+08 0.008824
1.469542e+09 0.005882
1.633016e+09 0.005882
...
3.671392e+08 0.002941
5.777662e+07 0.002941
5.457799e+08 0.002941
1.485114e+08 0.002941
4.985294e+08 0.002941
Name: proportion, Length: 323, dtype: float64
***** P/E Ratio *****
Preview counts for: P/E Ratio
93.089287 17
20.819876 6
24.070121 5
28.407929 3
17.313076 3
..
18.636363 1
21.175469 1
55.829113 1
7.575269 1
70.470585 1
Name: count, Length: 295, dtype: int64
Preview % breakdown for: P/E Ratio
93.089287 0.050000
20.819876 0.017647
24.070121 0.014706
28.407929 0.008824
17.313076 0.008824
...
18.636363 0.002941
21.175469 0.002941
55.829113 0.002941
7.575269 0.002941
70.470585 0.002941
Name: proportion, Length: 295, dtype: float64
***** P/B Ratio *****
Preview counts for: P/B Ratio
-1.269332 6
-3.089477 5
-3.973395 4
3.954975 4
-4.327138 4
..
-14.151445 1
5.108756 1
4.427425 1
5.884467 1
1.723068 1
Name: count, Length: 299, dtype: int64
Preview % breakdown for: P/B Ratio
-1.269332 0.017647
-3.089477 0.014706
-3.973395 0.011765
3.954975 0.011765
-4.327138 0.011765
...
-14.151445 0.002941
5.108756 0.002941
4.427425 0.002941
5.884467 0.002941
1.723068 0.002941
Name: proportion, Length: 299, dtype: float64
# Use a for loop to show the counts and percent contribution of each categorical column.
for col in list_catcol:
print(f'***** {col} *****')
print('Preview counts for:',df_eda[col].value_counts(),'\n')
print('Preview % breakdown for:',df_eda[col].value_counts(normalize=True),'\n')
***** Ticker Symbol *****
Preview counts for: Ticker Symbol
AAL 1
NEE 1
NUE 1
NTRS 1
NSC 1
..
EQR 1
EQIX 1
EOG 1
EMN 1
ZTS 1
Name: count, Length: 340, dtype: int64
Preview % breakdown for: Ticker Symbol
AAL 0.002941
NEE 0.002941
NUE 0.002941
NTRS 0.002941
NSC 0.002941
...
EQR 0.002941
EQIX 0.002941
EOG 0.002941
EMN 0.002941
ZTS 0.002941
Name: proportion, Length: 340, dtype: float64
***** Security *****
Preview counts for: Security
American Airlines Group 1
NextEra Energy 1
Nucor Corp. 1
Northern Trust Corp. 1
Norfolk Southern Corp. 1
..
Equity Residential 1
Equinix 1
EOG Resources 1
Eastman Chemical 1
Zoetis 1
Name: count, Length: 340, dtype: int64
Preview % breakdown for: Security
American Airlines Group 0.002941
NextEra Energy 0.002941
Nucor Corp. 0.002941
Northern Trust Corp. 0.002941
Norfolk Southern Corp. 0.002941
...
Equity Residential 0.002941
Equinix 0.002941
EOG Resources 0.002941
Eastman Chemical 0.002941
Zoetis 0.002941
Name: proportion, Length: 340, dtype: float64
***** GICS Sector *****
Preview counts for: GICS Sector
Industrials 53
Financials 49
Health Care 40
Consumer Discretionary 40
Information Technology 33
Energy 30
Real Estate 27
Utilities 24
Materials 20
Consumer Staples 19
Telecommunications Services 5
Name: count, dtype: int64
Preview % breakdown for: GICS Sector
Industrials 0.155882
Financials 0.144118
Health Care 0.117647
Consumer Discretionary 0.117647
Information Technology 0.097059
Energy 0.088235
Real Estate 0.079412
Utilities 0.070588
Materials 0.058824
Consumer Staples 0.055882
Telecommunications Services 0.014706
Name: proportion, dtype: float64
***** GICS Sub Industry *****
Preview counts for: GICS Sub Industry
Oil & Gas Exploration & Production 16
REITs 14
Industrial Conglomerates 14
Electric Utilities 12
Internet Software & Services 12
..
Technology Hardware, Storage & Peripherals 1
Real Estate Services 1
Trucking 1
Networking Equipment 1
Casinos & Gaming 1
Name: count, Length: 104, dtype: int64
Preview % breakdown for: GICS Sub Industry
Oil & Gas Exploration & Production 0.047059
REITs 0.041176
Industrial Conglomerates 0.041176
Electric Utilities 0.035294
Internet Software & Services 0.035294
...
Technology Hardware, Storage & Peripherals 0.002941
Real Estate Services 0.002941
Trucking 0.002941
Networking Equipment 0.002941
Casinos & Gaming 0.002941
Name: proportion, Length: 104, dtype: float64
# Check counts, unique counts and duplicate values for each column. Based on this, we can decide how to approach each variable.
for col in df_eda:
unique_count = df_eda[col].nunique()
duplicate_count = df_eda[col].count()-df_eda[col].nunique()
empty_count = df_eda[col].isnull().sum().sum()
print(f'Series {col} has {unique_count} unique values, {duplicate_count} duplicate values, and {empty_count} missing values.')
Series Ticker Symbol has 340 unique values, 0 duplicate values, and 0 missing values. Series Security has 340 unique values, 0 duplicate values, and 0 missing values. Series GICS Sector has 11 unique values, 329 duplicate values, and 0 missing values. Series GICS Sub Industry has 104 unique values, 236 duplicate values, and 0 missing values. Series Current Price has 336 unique values, 4 duplicate values, and 0 missing values. Series Price Change has 340 unique values, 0 duplicate values, and 0 missing values. Series Volatility has 340 unique values, 0 duplicate values, and 0 missing values. Series ROE has 82 unique values, 258 duplicate values, and 0 missing values. Series Cash Ratio has 124 unique values, 216 duplicate values, and 0 missing values. Series Net Cash Flow has 332 unique values, 8 duplicate values, and 0 missing values. Series Net Income has 335 unique values, 5 duplicate values, and 0 missing values. Series Earnings Per Share has 268 unique values, 72 duplicate values, and 0 missing values. Series Estimated Shares Outstanding has 323 unique values, 17 duplicate values, and 0 missing values. Series P/E Ratio has 295 unique values, 45 duplicate values, and 0 missing values. Series P/B Ratio has 299 unique values, 41 duplicate values, and 0 missing values.
# Chart high-level assessment for each numerical series
for col in list_numcol:
mean = df[col].mean()
median = df[col].median()
min = df[col].min()
max = df[col].max()
this_range = max-min
print('*****')
print('For', col, ':')
print(' * Mean =', mean)
print(' * Median =' , median)
print(' * Range =', min, 'to' , max,'(', round(this_range,3), ')')
***** For Current Price : * Mean = 80.86234477216765 * Median = 59.705 * Range = 4.5 to 1274.949951 ( 1270.45 ) ***** For Price Change : * Mean = 4.078193812467648 * Median = 4.819504729 * Range = -47.12969338 to 55.05168339 ( 102.181 ) ***** For Volatility : * Mean = 1.5259762538411765 * Median = 1.3855929255000001 * Range = 0.733163184 to 4.58004173 ( 3.847 ) ***** For ROE : * Mean = 39.59705882352941 * Median = 15.0 * Range = 1 to 917 ( 916 ) ***** For Cash Ratio : * Mean = 70.02352941176471 * Median = 47.0 * Range = 0 to 958 ( 958 ) ***** For Net Cash Flow : * Mean = 55537620.5882353 * Median = 2098000.0 * Range = -11208000000 to 20764000000 ( 31972000000 ) ***** For Net Income : * Mean = 1494384602.9411764 * Median = 707336000.0 * Range = -23528000000 to 24442000000 ( 47970000000 ) ***** For Earnings Per Share : * Mean = 2.776661764705882 * Median = 2.895 * Range = -61.2 to 50.09 ( 111.29 ) ***** For Estimated Shares Outstanding : * Mean = 577028337.7540295 * Median = 309675137.79999995 * Range = 27672156.86 to 6159292035.0 ( 6131619878.14 ) ***** For P/E Ratio : * Mean = 32.612563318264705 * Median = 20.81987609 * Range = 2.935450768 to 528.0390742 ( 525.104 ) ***** For P/B Ratio : * Mean = -1.7182493633029414 * Median = -1.0671703205 * Range = -76.11907749 to 129.0645854 ( 205.184 )
# Check for null values
print(f'EDA dataset has {df_eda.isnull().sum().sum()} null values:',)
# Replace any missing values with NaNs
df_eda = df_eda.replace("?", np.nan)
# Count NaN values
nan_count = df_eda.isnull().sum().sum()
if nan_count > 0:
print(f"EDA DataFrame has {nan_count} NaN values.")
else:
print("EDA DataFrame does not have NaN values.")
EDA dataset has 0 null values: EDA DataFrame does not have NaN values.
# Preview rows with missing data
df_eda[df_eda.isnull().any(axis=1)]
| Ticker Symbol | Security | GICS Sector | GICS Sub Industry | Current Price | Price Change | Volatility | ROE | Cash Ratio | Net Cash Flow | Net Income | Earnings Per Share | Estimated Shares Outstanding | P/E Ratio | P/B Ratio |
|---|
# Preview a random sample of the data
df_eda.sample(n=10, random_state=1)
| Ticker Symbol | Security | GICS Sector | GICS Sub Industry | Current Price | Price Change | Volatility | ROE | Cash Ratio | Net Cash Flow | Net Income | Earnings Per Share | Estimated Shares Outstanding | P/E Ratio | P/B Ratio | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 102 | DVN | Devon Energy Corp. | Energy | Oil & Gas Exploration & Production | 32.000000 | -15.478079 | 2.923698 | 205 | 70 | 830000000 | -14454000000 | -35.55 | 4.065823e+08 | 93.089287 | 1.785616 |
| 125 | FB | Information Technology | Internet Software & Services | 104.660004 | 16.224320 | 1.320606 | 8 | 958 | 592000000 | 3669000000 | 1.31 | 2.800763e+09 | 79.893133 | 5.884467 | |
| 11 | AIV | Apartment Investment & Mgmt | Real Estate | REITs | 40.029999 | 7.578608 | 1.163334 | 15 | 47 | 21818000 | 248710000 | 1.52 | 1.636250e+08 | 26.335526 | -1.269332 |
| 248 | PG | Procter & Gamble | Consumer Staples | Personal Products | 79.410004 | 10.660538 | 0.806056 | 17 | 129 | 160383000 | 636056000 | 3.28 | 4.913916e+08 | 24.070121 | -2.256747 |
| 238 | OXY | Occidental Petroleum | Energy | Oil & Gas Exploration & Production | 67.610001 | 0.865287 | 1.589520 | 32 | 64 | -588000000 | -7829000000 | -10.23 | 7.652981e+08 | 93.089287 | 3.345102 |
| 336 | YUM | Yum! Brands Inc | Consumer Discretionary | Restaurants | 52.516175 | -8.698917 | 1.478877 | 142 | 27 | 159000000 | 1293000000 | 2.97 | 4.353535e+08 | 17.682214 | -3.838260 |
| 112 | EQT | EQT Corporation | Energy | Oil & Gas Exploration & Production | 52.130001 | -21.253771 | 2.364883 | 2 | 201 | 523803000 | 85171000 | 0.56 | 1.520911e+08 | 93.089287 | 9.567952 |
| 147 | HAL | Halliburton Co. | Energy | Oil & Gas Equipment & Services | 34.040001 | -5.101751 | 1.966062 | 4 | 189 | 7786000000 | -671000000 | -0.79 | 8.493671e+08 | 93.089287 | 17.345857 |
| 89 | DFS | Discover Financial Services | Financials | Consumer Finance | 53.619999 | 3.653584 | 1.159897 | 20 | 99 | 2288000000 | 2297000000 | 5.14 | 4.468872e+08 | 10.431906 | -0.375934 |
| 173 | IVZ | Invesco Ltd. | Financials | Asset Management & Custody Banks | 33.480000 | 7.067477 | 1.580839 | 12 | 67 | 412000000 | 968100000 | 2.26 | 4.283628e+08 | 14.814159 | 4.218620 |
OBSERVATIONS
The raw dataset has 340 rows and 15 columns total:
4 columns of type object
7 columns of type float64
4 columns of type int64
Some duplicate data exists. We'll need to examine these to see if we expect duplicates in these features, or if we need to de-dupe the dataset.
No values are null or missing.
As part of our analysis, we will answer the following questions:
The following functions will assist us in our analysis.
# Import libraries we need
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler, StandardScaler
# The functions below are defined here so we can avoid repeating similar code for commonly used plots.
# These functions are called throughout this notebook.
# This function prints a barplot
def print_barplot(data, column, width, height, titleText, hue=None):
'''
Print a single, sorted barplot with title and labels and optional hue.
'''
# Set figure size and title
plt.figure(figsize=(width, height))
plt.title(titleText)
plt.xticks(rotation=90)
# Sort by values
data_sorted = data[column].value_counts().sort_values(ascending=False).reset_index()
data_sorted.columns = ['category', 'counts']
# Plotting
if hue is None:
sns.barplot(x='category', y='counts', data=data_sorted)
else:
sns.barplot(x='category', y='counts', hue=hue, data=data_sorted)
plt.xlabel('Categories')
plt.ylabel('Counts')
plt.show()
# This function creates a labeled barplot
def print_labeled_barplot(data, feature, perc=False, n=None):
'''
Barplot with percentage at the top
data: dataframe
feature: dataframe column
perc: whether to display percentages instead of count (default is False)
n: displays the top n category levels (default is None, i.e., display all levels)
'''
total = len(data[feature]) # length of the column
count = data[feature].nunique()
if n is None:
plt.figure(figsize=(count + 1, 5))
else:
plt.figure(figsize=(n + 1, 5))
plt.xticks(rotation=90, fontsize=15)
ax = sns.countplot(
data=data,
x=feature,
palette="Paired",
order=data[feature].value_counts().index[:n],
)
for p in ax.patches:
if perc == True:
label = "{:.1f}%".format(
100 * p.get_height() / total
) # percentage of each class of the category
else:
label = p.get_height() # count of each level of the category
x = p.get_x() + p.get_width() / 2 # width of the plot
y = p.get_height() # height of the plot
ax.annotate(
label,
(x, y),
ha="center",
va="center",
size=12,
xytext=(0, 5),
textcoords="offset points",
) # annotate the percentage
plt.show() # show the plot
# Function to plot a boxplot and a histogram on same scale
def histogram_boxplot(data, feature, figsize=(10, 5), kde=False, bins=None):
'''
Boxplot and histogram combined
data: dataframe
feature: dataframe column
figsize: size of figure (default (10,5))
kde: whether to the show density curve (default False)
bins: number of bins for histogram (default None)
'''
# Create a figure with two subplots
f2, (ax_box2, ax_hist2) = plt.subplots(
nrows=2, # Number of rows of the subplot grid
sharex=True, # x-axis will be shared among all subplots
gridspec_kw={'height_ratios': (0.25, 0.75)}, # Ratio of subplot heights
figsize=figsize, # Size of the figure
)
# Boxplot
sns.boxplot(
data=data, x=feature, ax=ax_box2, showmeans=True, color='blue'
)
ax_box2.set_title(f'Boxplot of {feature}') # Title for the boxplot
# Histogram
if bins:
sns.histplot(
data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins, palette='winter'
)
else:
sns.histplot(
data=data, x=feature, kde=kde, ax=ax_hist2
)
ax_hist2.axvline(
data[feature].mean(), color='green', linestyle="--", label='Mean'
)
ax_hist2.axvline(
data[feature].median(), color='black', linestyle="-", label='Median'
)
ax_hist2.set_title(f'Histogram of {feature}') # Title for the histogram
ax_hist2.legend() # Add legend to the histogram plot
# Adjust layout
plt.tight_layout()
plt.show()
# Function to calculate Z-score
def calculate_z_score(value, mean, std):
'''Calculate the Z-score for a single value '''
if std == 0:
return 0 # Avoid division by zero
return (value - mean) / std
# Function to print the statistics for each feature
def print_numerical_feature_statistics(data, feature):
'''
Show the statistics of a feature.
data: dataframe
feature: dataframe column
'''
# Calculate Shapiro-Wilks to test normal distribution
stat, p = shapiro(data[feature])
# Calculate mean and standard deviation
mean = data[feature].mean()
std = data[feature].std()
# Calculate Z-score for the mean
z_score = calculate_z_score(mean, mean, std)
new_stats = {
'Feature': feature,
'Mean': mean,
'Median': data[feature].median(),
'Min': data[feature].min(),
'Max': data[feature].max(),
'Range': data[feature].max() - data[feature].min(),
'Standard Deviation': std,
'Variance': data[feature].var(),
'P-value': p
}
new_stats_df = pd.DataFrame([new_stats])
return new_stats_df
First, we want to analyze each feature independently.
# Analyze critical statistics
# Create empty dataframe to store the vital statistics
stat_columns = [
'Feature',
'Mean',
'Median',
'Min',
'Max',
'Range',
'Standard Deviation',
'Variance',
'P-value']
df_stats = pd.DataFrame(columns=stat_columns)
# Drop ticker symbol and security for this purpose, since they are unique
df_eda.drop(columns=['Ticker Symbol','Security'], axis=1, inplace=True)
# Recreate numerical and categorical column lists
list_catcol = df_eda.select_dtypes(include=['object']).columns.to_list()
list_numcol = df_eda.select_dtypes(exclude=['object']).columns.to_list()
# Populate stats and show them
for feature in list_numcol:
new_stats_df = print_numerical_feature_statistics(df_eda.select_dtypes(include=['number']), feature)
df_stats = pd.concat([df_stats, new_stats_df], ignore_index=True)
# Set the display precision
pd.set_option('display.float_format', '{:.6f}'.format)
# Show statistics
df_stats
| Feature | Mean | Median | Min | Max | Range | Standard Deviation | Variance | P-value | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Current Price | 80.862345 | 59.705000 | 4.500000 | 1274.949951 | 1270.449951 | 98.055086 | 9614.799896 | 0.000000 |
| 1 | Price Change | 4.078194 | 4.819505 | -47.129693 | 55.051683 | 102.181377 | 12.006338 | 144.152149 | 0.000000 |
| 2 | Volatility | 1.525976 | 1.385593 | 0.733163 | 4.580042 | 3.846879 | 0.591798 | 0.350225 | 0.000000 |
| 3 | ROE | 39.597059 | 15.000000 | 1.000000 | 917.000000 | 916.000000 | 96.547538 | 9321.427130 | 0.000000 |
| 4 | Cash Ratio | 70.023529 | 47.000000 | 0.000000 | 958.000000 | 958.000000 | 90.421331 | 8176.017144 | 0.000000 |
| 5 | Net Cash Flow | 55537620.588235 | 2098000.000000 | -11208000000.000000 | 20764000000.000000 | 31972000000.000000 | 1946365312.175789 | 3788337928441157632.000000 | 0.000000 |
| 6 | Net Income | 1494384602.941176 | 707336000.000000 | -23528000000.000000 | 24442000000.000000 | 47970000000.000000 | 3940150279.327937 | 15524784223688019968.000000 | 0.000000 |
| 7 | Earnings Per Share | 2.776662 | 2.895000 | -61.200000 | 50.090000 | 111.290000 | 6.587779 | 43.398830 | 0.000000 |
| 8 | Estimated Shares Outstanding | 577028337.754030 | 309675137.800000 | 27672156.860000 | 6159292035.000000 | 6131619878.140000 | 845849595.417695 | 715461538068278272.000000 | 0.000000 |
| 9 | P/E Ratio | 32.612563 | 20.819876 | 2.935451 | 528.039074 | 525.103623 | 44.348731 | 1966.809957 | 0.000000 |
| 10 | P/B Ratio | -1.718249 | -1.067170 | -76.119077 | 129.064585 | 205.183663 | 13.966912 | 195.074625 | 0.000000 |
# Drop numerical columns for charting object histograms
df_eda_plot_cat = df_eda.copy().drop(columns=list_numcol, axis=1)
# Show sectors as a visual
# Create counts and percentages
value_counts = df_eda_plot_cat['GICS Sector'].value_counts().reset_index()
value_counts.columns = ['Category', 'Count']
total_counts = value_counts['Count'].sum()
value_counts['Percentage'] = (value_counts['Count'] / total_counts) * 100
value_counts_sorted = value_counts.sort_values(by='Count', ascending=False)
# Plot sector contributions
plt.figure(figsize=(12, 8))
bar_plot = sns.barplot(x='Category', y='Count', data=value_counts_sorted, order=value_counts_sorted['Category'], hue='Category')
plt.xticks(rotation=90)
plt.xlabel('Categories')
plt.ylabel('Counts')
plt.title('Sector Counts')
for p in bar_plot.patches:
count = p.get_height()
percentage = value_counts_sorted["Percentage"].iloc[int(p.get_x() + p.get_width() / 2)]
bar_plot.annotate(f'{count:.1f}\n({percentage:.1f}%)',
(p.get_x() + p.get_width() / 2., count),
ha='center', va='bottom')
# Print value contributions for each sector
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
display(value_counts_sorted)
# Since our dataset is small, we'll leave this display option intact for later
#pd.set_option('display.max_rows', 10)
#pd.set_option('display.max_columns', 10)
| Category | Count | Percentage | |
|---|---|---|---|
| 0 | Industrials | 53 | 15.588235 |
| 1 | Financials | 49 | 14.411765 |
| 2 | Health Care | 40 | 11.764706 |
| 3 | Consumer Discretionary | 40 | 11.764706 |
| 4 | Information Technology | 33 | 9.705882 |
| 5 | Energy | 30 | 8.823529 |
| 6 | Real Estate | 27 | 7.941176 |
| 7 | Utilities | 24 | 7.058824 |
| 8 | Materials | 20 | 5.882353 |
| 9 | Consumer Staples | 19 | 5.588235 |
| 10 | Telecommunications Services | 5 | 1.470588 |
# Show sub sector contributions as a visual
# Create counts and percentages
value_counts_sub = df_eda_plot_cat['GICS Sub Industry'].value_counts().reset_index()
value_counts_sub.columns = ['Category', 'Count']
total_counts_sub = value_counts_sub['Count'].sum()
value_counts_sub['Percentage'] = (value_counts_sub['Count'] / total_counts_sub) * 100
value_counts_sorted_sub = value_counts_sub.sort_values(by='Count', ascending=False)
# Plot sector contributions
plt.figure(figsize=(20, 8))
bar_plot = sns.barplot(x='Category', y='Count', data=value_counts_sorted_sub, order=value_counts_sorted_sub['Category'], hue='Category')
plt.xticks(rotation=90)
plt.xlabel('Categories')
plt.ylabel('Counts')
plt.title('Sector Counts')
plt.show()
# Print value contributions for each sub sector
print('Top 10 Sub Sectors')
display(value_counts_sorted_sub.head(10))
Top 10 Sub Sectors
| Category | Count | Percentage | |
|---|---|---|---|
| 0 | Oil & Gas Exploration & Production | 16 | 4.705882 |
| 2 | Industrial Conglomerates | 14 | 4.117647 |
| 1 | REITs | 14 | 4.117647 |
| 3 | Electric Utilities | 12 | 3.529412 |
| 4 | Internet Software & Services | 12 | 3.529412 |
| 5 | Health Care Equipment | 11 | 3.235294 |
| 6 | MultiUtilities | 11 | 3.235294 |
| 7 | Banks | 10 | 2.941176 |
| 8 | Property & Casualty Insurance | 8 | 2.352941 |
| 9 | Diversified Financial Services | 7 | 2.058824 |
print('Bottom Sub Sectors')
display(value_counts_sorted_sub[value_counts_sorted_sub['Percentage'] < .3])
Bottom Sub Sectors
| Category | Count | Percentage | |
|---|---|---|---|
| 93 | Electronic Equipment & Instruments | 1 | 0.294118 |
| 85 | Financial Exchanges & Data | 1 | 0.294118 |
| 86 | Water Utilities | 1 | 0.294118 |
| 87 | Home Entertainment Software | 1 | 0.294118 |
| 88 | Drug Retail | 1 | 0.294118 |
| 89 | Electrical Components & Equipment | 1 | 0.294118 |
| 90 | Semiconductor Equipment | 1 | 0.294118 |
| 91 | Multi-line Insurance | 1 | 0.294118 |
| 92 | Copper | 1 | 0.294118 |
| 96 | Tires & Rubber | 1 | 0.294118 |
| 94 | Diversified Commercial Services | 1 | 0.294118 |
| 95 | Consumer Electronics | 1 | 0.294118 |
| 83 | Agricultural Products | 1 | 0.294118 |
| 97 | Industrial Materials | 1 | 0.294118 |
| 98 | Motorcycle Manufacturers | 1 | 0.294118 |
| 99 | Technology Hardware, Storage & Peripherals | 1 | 0.294118 |
| 100 | Real Estate Services | 1 | 0.294118 |
| 101 | Trucking | 1 | 0.294118 |
| 102 | Networking Equipment | 1 | 0.294118 |
| 84 | Metal & Glass Containers | 1 | 0.294118 |
| 79 | Multi-Sector Holdings | 1 | 0.294118 |
| 82 | Distributors | 1 | 0.294118 |
| 70 | Human Resource & Employment Services | 1 | 0.294118 |
| 62 | Gold | 1 | 0.294118 |
| 63 | Environmental Services | 1 | 0.294118 |
| 64 | Home Furnishings | 1 | 0.294118 |
| 65 | Household Appliances | 1 | 0.294118 |
| 66 | Apparel, Accessories & Luxury Goods | 1 | 0.294118 |
| 67 | Specialty Retail | 1 | 0.294118 |
| 68 | Life Sciences Tools & Services | 1 | 0.294118 |
| 69 | Publishing | 1 | 0.294118 |
| 71 | Steel | 1 | 0.294118 |
| 81 | Computer Hardware | 1 | 0.294118 |
| 72 | Housewares & Specialties | 1 | 0.294118 |
| 73 | Thrifts & Mortgage Finance | 1 | 0.294118 |
| 74 | Brewers | 1 | 0.294118 |
| 75 | Technology, Hardware, Software and Supplies | 1 | 0.294118 |
| 76 | Personal Products | 1 | 0.294118 |
| 77 | Industrial Gases | 1 | 0.294118 |
| 78 | Office REITs | 1 | 0.294118 |
| 80 | Alternative Carriers | 1 | 0.294118 |
| 103 | Casinos & Gaming | 1 | 0.294118 |
# Drop categorical columns this time, so we can chart histograms and boxplots for the numerical features
df_eda_plot_num = df_eda.copy().drop(columns=list_catcol, axis=1)
# Analyze features
for feature in df_eda_plot_num.columns:
histogram_boxplot(df_eda_plot_num, feature, figsize=(8, 5), kde=True, bins=None)
OBSERVATIONS
The sectors break down as follows, in terms of sub sector industry security counts:
Many of the features exhibit skewness, while a few appear normal. Most have outliers. In addition, the ranges of data vary on scales that are quite different. We will likely need to scale our data and calculate z-scores later prior to clustering the data.
Right-skewed
Fairly normal
For Current Price :
For Price Change :
For Volatility :
For ROE :
For Cash Ratio :
For Net Cash Flow :
For Net Income :
For Earnings Per Share :
For Estimated Shares Outstanding :
For P/E Ratio :
For P/B Ratio :
The five most common GICS Sectors by percentage and counts, out of 340 total:
The five least common GICS Sectors by percentage and counts, out of 340 total:
The top 5 sub industry sectors are:
We can also see that 42 of the worst-performing sub industry sectors have contributions of less than .3%.
Next, let's identify any relationships between variables.
In order to do this, we need to ensure all variables can be compared. As we can see below, we have several categorical variables.
df_eda.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 340 entries, 0 to 339 Data columns (total 13 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 GICS Sector 340 non-null object 1 GICS Sub Industry 340 non-null object 2 Current Price 340 non-null float64 3 Price Change 340 non-null float64 4 Volatility 340 non-null float64 5 ROE 340 non-null int64 6 Cash Ratio 340 non-null int64 7 Net Cash Flow 340 non-null int64 8 Net Income 340 non-null int64 9 Earnings Per Share 340 non-null float64 10 Estimated Shares Outstanding 340 non-null float64 11 P/E Ratio 340 non-null float64 12 P/B Ratio 340 non-null float64 dtypes: float64(7), int64(4), object(2) memory usage: 34.7+ KB
# Create new rankings for GICS Sector
unique_sectors = df_eda['GICS Sector'].unique()
ranked_categories = sorted(unique_sectors)
rank_mapping = {}
# Add a dictionary for friendly names
friendly_name_mapping = {}
# Populate mapping and rank
for rank, category in enumerate(ranked_categories):
rank_mapping[category] = rank + 1
friendly_name_mapping[category] = f"Rank {rank + 1}: {category}" # Example friendly name
# Map the rankings and create a new column in the DataFrame
df_eda['GICS Sector Int'] = df_eda['GICS Sector'].map(rank_mapping)
# Drop the non-numerical columns
cols_to_drop = ['GICS Sector', 'GICS Sub Industry']
df_eda_corr = df_eda.copy().drop(columns=cols_to_drop, axis=1)
# Preview new dataframe we will use for correlation heatmap
df_eda_corr
| Current Price | Price Change | Volatility | ROE | Cash Ratio | Net Cash Flow | Net Income | Earnings Per Share | Estimated Shares Outstanding | P/E Ratio | P/B Ratio | GICS Sector Int | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 42.349998 | 9.999995 | 1.687151 | 135 | 51 | -604000000 | 7610000000 | 11.390000 | 668129938.500000 | 3.718174 | -8.784219 | 6 |
| 1 | 59.240002 | 8.339433 | 2.197887 | 130 | 77 | 51000000 | 5144000000 | 3.150000 | 1633015873.000000 | 18.806350 | -8.750068 | 5 |
| 2 | 44.910000 | 11.301121 | 1.273646 | 21 | 67 | 938000000 | 4423000000 | 2.940000 | 1504421769.000000 | 15.275510 | -0.394171 | 5 |
| 3 | 93.940002 | 13.977195 | 1.357679 | 9 | 180 | -240840000 | 629551000 | 1.260000 | 499643650.800000 | 74.555557 | 4.199651 | 7 |
| 4 | 55.320000 | -1.827858 | 1.701169 | 14 | 272 | 315120000 | 696878000 | 0.310000 | 2247993548.000000 | 178.451613 | 1.059810 | 7 |
| 5 | 36.680000 | -12.017268 | 1.516493 | 10 | 49 | -189000000 | 1849000000 | 2.990000 | 618394648.800000 | 12.267559 | 7.496831 | 2 |
| 6 | 276.570007 | 6.189286 | 1.116976 | 30 | 25 | 90885000 | 596541000 | 8.910000 | 66951851.850000 | 31.040405 | 129.064585 | 7 |
| 7 | 43.230000 | 2.174424 | 1.124186 | 9 | 14 | 287000000 | 636000000 | 2.600000 | 244615384.600000 | 16.626923 | -0.719497 | 11 |
| 8 | 58.270000 | 2.371753 | 1.068485 | 11 | 9 | 13900000 | 2052300000 | 3.130000 | 421897810.200000 | 18.456543 | -3.022649 | 11 |
| 9 | 59.900002 | 3.027181 | 1.048295 | 14 | 99 | -308000000 | 2533000000 | 5.880000 | 430782312.900000 | 10.187075 | -1.883912 | 4 |
| 10 | 61.970001 | 8.358108 | 1.106965 | 2 | 99 | -129000000 | 2196000000 | 1.690000 | 1299408284.000000 | 36.668640 | -4.327138 | 4 |
| 11 | 40.029999 | 7.578608 | 1.163334 | 15 | 47 | 21818000 | 248710000 | 1.520000 | 163625000.000000 | 26.335526 | -1.269332 | 9 |
| 12 | 80.540001 | 1.897773 | 1.112604 | 3 | 99 | -30351000 | 141555000 | 2.080000 | 68055288.460000 | 38.721154 | -4.072615 | 4 |
| 13 | 40.939999 | -0.606943 | 1.052050 | 10 | 99 | 166000000 | 356800000 | 2.070000 | 172367149.800000 | 19.777777 | -9.855706 | 4 |
| 14 | 52.630001 | -23.790903 | 1.384502 | 10 | 225 | 50823000 | 321406000 | 1.800000 | 178558888.900000 | 29.238889 | 4.282358 | 7 |
| 15 | 56.009998 | 26.461948 | 1.974323 | 10 | 13 | -2276034000 | 334906000 | 3.010000 | 111264451.800000 | 18.607973 | -13.649723 | 8 |
| 16 | 80.510002 | 2.066436 | 1.773431 | 35 | 74 | -34000000 | 848000000 | 6.610000 | 128290469.000000 | 12.180031 | -1.114658 | 6 |
| 17 | 62.090000 | 6.592275 | 1.053266 | 11 | 99 | -162000000 | 2171000000 | 5.120000 | 424023437.500000 | 12.126953 | -4.327138 | 4 |
| 18 | 65.919998 | 13.753230 | 1.283795 | 601 | 45 | -90800000 | 153900000 | 1.600000 | 96187500.000000 | 41.199999 | -0.877453 | 6 |
| 19 | 190.750000 | 22.338380 | 2.022921 | 2 | 195 | 66000000 | 144000000 | 0.680000 | 211764705.900000 | 280.514706 | -14.171389 | 5 |
| 20 | 18.670000 | 26.834239 | 1.460030 | 18 | 131 | 1795000000 | 1377000000 | 1.130000 | 1218584071.000000 | 16.522124 | 3.857756 | 7 |
| 21 | 53.590000 | 2.212474 | 1.089266 | 18 | 37 | 3390000 | 590859000 | 2.460000 | 240186585.400000 | 21.784553 | -4.490342 | 6 |
| 22 | 159.759995 | -6.611335 | 2.093065 | 18 | 67 | 13200000 | 516000000 | 9.490000 | 54373024.240000 | 16.834562 | -31.015380 | 4 |
| 23 | 162.330002 | 17.163478 | 1.630259 | 25 | 362 | 413000000 | 6939000000 | 9.150000 | 758360655.700000 | 17.740984 | 24.012322 | 5 |
| 24 | 106.419998 | -2.420686 | 1.222260 | 22 | 67 | -281000000 | 1562000000 | 8.600000 | 181627907.000000 | 12.374418 | -13.398380 | 4 |
| 25 | 96.949997 | 10.233087 | 1.165804 | 10 | 39 | 7194000 | 685074000 | 1.420000 | 482446478.900000 | 68.274646 | -20.813577 | 9 |
| 26 | 675.890015 | 32.268105 | 1.460386 | 4 | 58 | 1333000000 | 596000000 | 1.280000 | 465625000.000000 | 528.039074 | 3.904430 | 1 |
| 27 | 59.660000 | 2.350316 | 1.480914 | 19 | 1 | -1300000 | 442600000 | 3.930000 | 112620865.100000 | 15.180662 | -7.970104 | 1 |
| 28 | 139.440002 | -0.620053 | 1.511654 | 11 | 70 | -38200000 | 2560000000 | 9.730000 | 263103802.700000 | 14.330935 | -31.006773 | 5 |
| 29 | 92.209999 | 3.910301 | 1.105032 | 23 | 99 | 10000000 | 1385000000 | 4.930000 | 280933062.900000 | 18.703854 | -7.759856 | 4 |
| 30 | 44.470001 | 11.397804 | 2.405408 | 917 | 80 | 698000000 | -23528000000 | -61.200000 | 384444444.400000 | 93.089287 | 4.970809 | 3 |
| 31 | 48.580002 | -20.802083 | 2.435165 | 52 | 22 | -6430000000 | -6692000000 | -13.180000 | 507738998.500000 | 93.089287 | -12.860938 | 3 |
| 32 | 52.230000 | 2.693667 | 1.007762 | 24 | 175 | 768300000 | 763500000 | 2.470000 | 309109311.700000 | 21.145749 | 8.202923 | 7 |
| 33 | 7.398807 | 1.647784 | 2.592065 | 3 | 37 | 42000000 | -322000000 | -0.310000 | 1038709677.000000 | 18.687607 | 2.639814 | 6 |
| 34 | 38.709999 | 23.319529 | 1.886335 | 11 | 70 | -3025000000 | 892000000 | 1.210000 | 737190082.600000 | 31.991735 | 0.290291 | 7 |
| 35 | 184.130005 | 4.857630 | 1.132875 | 8 | 47 | -108953000 | 741733000 | 5.540000 | 133886823.100000 | 33.236463 | -3.089477 | 9 |
| 36 | 145.149994 | 17.902683 | 1.847180 | 29 | 163 | 218000000 | 1364000000 | 2.860000 | 1044055944.000000 | 18.249407 | 3.954975 | 7 |
| 37 | 59.750000 | 8.596874 | 1.171525 | 9 | 4 | 22000000 | 476000000 | 2.660000 | 178947368.400000 | 22.462406 | -4.895294 | 11 |
| 38 | 69.550003 | -6.216290 | 0.900066 | 25 | 99 | 474000000 | 5163000000 | 3.900000 | 506660363.000000 | 10.263506 | -0.609074 | 4 |
| 39 | 144.589996 | 10.105078 | 1.155905 | 82 | 24 | -431000000 | 5176000000 | 7.520000 | 688297872.300000 | 19.227393 | 22.032612 | 6 |
| 40 | 16.830000 | 8.440722 | 1.418688 | 6 | 99 | 20764000000 | 15888000000 | 4.180000 | 845069512.200000 | 13.004785 | -0.938007 | 4 |
| 41 | 38.150002 | 16.702365 | 1.204526 | 11 | 128 | -712000000 | 968000000 | 1.780000 | 543820224.700000 | 21.432585 | 8.637045 | 5 |
| 42 | 37.810001 | 5.940045 | 1.077678 | 8 | 99 | 1386000000 | 2084000000 | 2.590000 | 804633204.600000 | 14.598456 | -0.852562 | 4 |
| 43 | 189.440002 | 1.549182 | 1.394436 | 9 | 82 | -9600000 | 135400000 | 1.800000 | 75222222.220000 | 105.244446 | -4.127770 | 5 |
| 44 | 46.150002 | -12.312367 | 2.559553 | 12 | 84 | 584000000 | -1967000000 | -4.490000 | 438084632.500000 | 93.089287 | 13.490544 | 3 |
| 45 | 306.350006 | 4.917982 | 1.825994 | 38 | 133 | 148900000 | 3547000000 | 15.380000 | 230624187.300000 | 19.918726 | 0.162602 | 5 |
| 46 | 41.220001 | 5.422003 | 1.201660 | 8 | 99 | -433000000 | 3158000000 | 2.730000 | 1156776557.000000 | 15.098901 | -3.321298 | 4 |
| 47 | 72.730003 | 16.535816 | 1.386684 | 22 | 10 | 32600000 | 280900000 | 2.050000 | 137024390.200000 | 35.478050 | -3.895657 | 8 |
| 48 | 68.790001 | 16.081680 | 1.498872 | 11 | 53 | -3186000000 | 1565000000 | 0.940000 | 1664893617.000000 | 73.180852 | 0.588026 | 5 |
| 49 | 18.440001 | 11.757582 | 1.491764 | 4 | 13 | -268000000 | -239000000 | -0.180000 | 1327777778.000000 | 31.468962 | -3.880921 | 5 |
| 50 | 43.230000 | 3.470563 | 2.058769 | 17 | 25 | -220100000 | 609700000 | 2.720000 | 224154411.800000 | 15.893382 | 1.044815 | 1 |
| 51 | 127.540001 | 7.203497 | 1.089469 | 10 | 47 | -1039361000 | 583106000 | 3.790000 | 153853825.900000 | 33.651715 | -1.269332 | 9 |
| 52 | 51.750000 | 4.714695 | 1.261984 | 8 | 99 | -11208000000 | 17242000000 | 5.410000 | 3187060998.000000 | 9.565619 | -1.167533 | 4 |
| 53 | 67.959999 | 3.550209 | 1.493553 | 17 | 25 | -881000000 | 2512000000 | 3.540000 | 709604519.800000 | 19.197740 | 6.264053 | 6 |
| 54 | 116.849998 | 13.193834 | 0.944847 | 10 | 99 | 1120000000 | 2834000000 | 8.710000 | 325373134.300000 | 13.415614 | -17.466101 | 4 |
| 55 | 34.580002 | 8.197757 | 1.297857 | 20 | 12 | -200481000 | 547132000 | 1.640000 | 333617073.200000 | 21.085367 | -3.415302 | 9 |
| 56 | 86.449997 | 9.569068 | 0.960191 | 21 | 36 | 3190000 | 1520992000 | 4.440000 | 342565765.800000 | 19.470720 | -10.666679 | 9 |
| 57 | 54.480000 | 8.938217 | 1.347239 | 7 | 20 | 1064000000 | 1757000000 | 2.260000 | 777433628.300000 | 24.106195 | -7.477166 | 1 |
| 58 | 119.760002 | 8.448793 | 2.000828 | 27 | 333 | 758700000 | 1602000000 | 2.020000 | 793069306.900000 | 59.287130 | -4.320051 | 5 |
| 59 | 40.810001 | -9.250611 | 2.368186 | 16 | 25 | -1710600000 | 664900000 | 2.970000 | 223872053.900000 | 13.740741 | -0.393528 | 8 |
| 60 | 26.190001 | 10.273688 | 1.189236 | 4 | 99 | -191000000 | 840000000 | 1.550000 | 541935483.900000 | 16.896775 | -0.063096 | 4 |
| 61 | 42.439999 | 1.047615 | 0.929026 | 20 | 38 | -93000000 | 410400000 | 3.130000 | 131118210.900000 | 13.559105 | -9.428134 | 2 |
| 62 | 4.500000 | -38.101788 | 4.559815 | 687 | 22 | -3283000000 | -14685000000 | -22.430000 | 654703522.100000 | 28.407929 | -1.840528 | 3 |
| 63 | 62.020000 | -9.008221 | 1.185473 | 44 | 12 | 39289000 | 509699000 | 3.520000 | 144800852.300000 | 17.619318 | 1.117804 | 6 |
| 64 | 183.100006 | 3.598507 | 1.697942 | 589 | 0 | 2000000 | -271000000 | -2.430000 | 111522633.700000 | 20.819876 | -76.119077 | 1 |
| 65 | 146.330002 | 8.682415 | 1.588398 | 17 | 70 | 548000000 | 2094000000 | 8.170000 | 256303549.600000 | 17.910649 | -8.805281 | 5 |
| 66 | 59.169998 | 9.777358 | 0.935812 | 10 | 99 | -47000000 | 634000000 | 3.870000 | 163824289.400000 | 15.289405 | -4.327138 | 4 |
| 67 | 66.620003 | 4.781379 | 0.895471 | 463 | 27 | -119000000 | 1384000000 | 1.530000 | 904575163.400000 | 43.542486 | -0.548324 | 2 |
| 68 | 41.830002 | 1.900129 | 1.557655 | 7 | 99 | 76000000 | 521000000 | 2.930000 | 177815699.700000 | 14.276451 | -0.063096 | 4 |
| 69 | 90.599998 | -2.402245 | 1.323348 | 6 | 99 | 326500000 | 1247000000 | 3.710000 | 336118598.400000 | 24.420485 | -58.649536 | 4 |
| 70 | 479.850006 | -33.131268 | 2.474002 | 22 | 237 | -171460000 | 475602000 | 15.300000 | 31085098.040000 | 31.362745 | 17.201329 | 1 |
| 71 | 88.010002 | -18.884791 | 1.472364 | 19 | 48 | -590000000 | 1399000000 | 7.860000 | 177989821.900000 | 11.197201 | 21.439428 | 6 |
| 72 | 36.080002 | 1.949712 | 1.037844 | 14 | 12 | 39000000 | 537000000 | 1.900000 | 282631578.900000 | 18.989475 | 0.063687 | 11 |
| 73 | 65.809998 | 21.712591 | 2.298696 | 16 | 70 | 150000000 | 355000000 | 2.990000 | 118729097.000000 | 22.010033 | -1.305493 | 5 |
| 74 | 18.360001 | 1.436470 | 1.389867 | 20 | 43 | -34000000 | -692000000 | -1.610000 | 429813664.600000 | 17.313076 | 0.516503 | 11 |
| 75 | 72.180000 | -0.619575 | 1.364592 | 9 | 99 | 781000000 | 4050000000 | 7.150000 | 566433566.400000 | 10.095105 | -0.725643 | 4 |
| 76 | 17.690001 | -20.099360 | 3.055818 | 6 | 0 | -20440000 | -113891000 | -0.280000 | 406753571.400000 | 93.089287 | -0.223147 | 3 |
| 77 | 134.199997 | -9.672215 | 1.555057 | 8 | 3 | -8796000 | 203523000 | 1.790000 | 140335195.500000 | 33.994412 | 0.855096 | 5 |
| 78 | 25.950001 | -4.349421 | 1.626219 | 17 | 74 | -41000000 | 1968000000 | 2.000000 | 984000000.000000 | 12.975001 | 0.902439 | 6 |
| 79 | 25.160000 | 0.159232 | 1.522194 | 6 | 3 | -2000000 | 878000000 | 1.580000 | 555696202.500000 | 15.924051 | -13.383212 | 10 |
| 80 | 60.020000 | -4.654489 | 1.338123 | 17 | 182 | 115100000 | 1623600000 | 2.670000 | 608089887.600000 | 22.479401 | 7.121644 | 7 |
| 81 | 75.650002 | 9.021477 | 1.968864 | 16 | 52 | 108369000 | 319361000 | 2.010000 | 158886069.700000 | 37.636817 | -1.765013 | 7 |
| 82 | 97.769997 | 1.326561 | 1.487367 | 14 | 11 | -22000000 | 5237000000 | 4.660000 | 1123819742.000000 | 20.980686 | -7.019809 | 2 |
| 83 | 89.959999 | 12.844955 | 1.750655 | 3 | 43 | -1763000000 | 4587000000 | 2.460000 | 1864634146.000000 | 36.569105 | 4.763937 | 3 |
| 84 | 92.860001 | -2.744032 | 2.692546 | 1 | 38 | 228529000 | 65900000 | 0.540000 | 122037037.000000 | 171.962965 | 5.673991 | 3 |
| 85 | 67.639999 | -3.988642 | 0.889931 | 15 | 8 | 289000000 | 1899000000 | 3.210000 | 591588785.000000 | 21.071651 | -7.604945 | 11 |
| 86 | 50.689999 | 13.375084 | 1.444219 | 42 | 31 | -116000000 | 4526000000 | 5.680000 | 796830985.900000 | 8.924296 | -16.730022 | 6 |
| 87 | 66.599998 | 37.489677 | 1.577881 | 20 | 60 | -1610000000 | 1953000000 | 2.170000 | 900000000.000000 | 30.691243 | 3.252222 | 8 |
| 88 | 76.269997 | 3.952561 | 1.551946 | 29 | 22 | 375200000 | 1940000000 | 4.030000 | 532235888.200000 | 14.842233 | 6.277287 | 6 |
| 89 | 53.619999 | 3.653584 | 1.159897 | 20 | 99 | 2288000000 | 2297000000 | 5.140000 | 446887159.500000 | 10.431906 | -0.375934 | 4 |
| 90 | 71.139999 | 15.674795 | 1.381490 | 15 | 11 | -59000000 | 709000000 | 4.920000 | 144105691.100000 | 14.459349 | -4.552214 | 5 |
| 91 | 70.416985 | 8.924595 | 1.191466 | 14 | 13 | -2214800000 | 3357400000 | 4.810000 | 698004158.000000 | 14.639706 | -13.759230 | 6 |
| 92 | 105.080002 | 2.049141 | 1.188454 | 19 | 26 | 848000000 | 8382000000 | 4.950000 | 1693333333.000000 | 21.228283 | -3.985039 | 1 |
| 93 | 26.680000 | 2.026769 | 1.689235 | 19 | 25 | 23000000 | 1034000000 | -2.430000 | 111522633.700000 | 20.819876 | -76.119077 | 1 |
| 94 | 25.219999 | 3.572891 | 1.812144 | 19 | 25 | 23000000 | 1034000000 | -2.430000 | 111522633.700000 | 20.819876 | -76.119077 | 1 |
| 95 | 85.730003 | 12.109326 | 1.440884 | 64 | 14 | -325000000 | 1450000000 | 5.080000 | 285433070.900000 | 16.875985 | -0.662152 | 1 |
| 96 | 75.620003 | 15.573900 | 1.070406 | 7 | 39 | 22239000 | 296689000 | 1.560000 | 190185256.400000 | 48.474361 | -7.489313 | 9 |
| 97 | 103.930000 | -1.188439 | 1.337924 | 15 | 38 | 46300000 | 168800000 | 4.680000 | 36068376.070000 | 22.207265 | -11.877441 | 6 |
| 98 | 61.310001 | 6.979585 | 1.507569 | 24 | 26 | -319396000 | 869829000 | 5.520000 | 157577717.400000 | 11.106884 | -2.293440 | 6 |
| 99 | 93.199997 | 18.049399 | 1.150797 | 35 | 58 | 683000000 | 764000000 | 4.000000 | 191000000.000000 | 23.299999 | -12.717277 | 2 |
| 100 | 71.389999 | -0.833448 | 1.096727 | 7 | 8 | -1179000000 | 2816000000 | 4.050000 | 695308642.000000 | 17.627160 | -4.426811 | 11 |
| 101 | 69.709999 | -3.622291 | 1.211643 | 6 | 79 | 533875000 | 269732000 | 1.270000 | 212387401.600000 | 54.889763 | 1.962527 | 5 |
| 102 | 32.000000 | -15.478079 | 2.923698 | 205 | 70 | 830000000 | -14454000000 | -35.550000 | 406582278.500000 | 93.089287 | 1.785616 | 3 |
| 103 | 27.480000 | 12.163265 | 1.409302 | 26 | 271 | -4496000000 | 1725000000 | 1.430000 | 1206293706.000000 | 19.216783 | 4.601699 | 7 |
| 104 | 114.379997 | 3.783684 | 1.078516 | 15 | 2 | -116800000 | 1002100000 | 3.380000 | 296479289.900000 | 33.840236 | -14.928867 | 8 |
| 105 | 64.269997 | -3.974306 | 1.068002 | 9 | 20 | 249000000 | 1193000000 | 4.070000 | 293120393.100000 | 15.791154 | -3.022649 | 11 |
| 106 | 111.370003 | 14.531063 | 1.081040 | 19 | 15 | -35000000 | 429100000 | 3.610000 | 118864265.900000 | 30.850416 | -8.116821 | 6 |
| 107 | 59.209999 | -6.135071 | 0.927260 | 10 | 3 | 29000000 | 1117000000 | 3.130000 | 356869009.600000 | 18.916933 | -6.369284 | 11 |
| 108 | 67.510002 | 3.654238 | 1.404508 | 22 | 14 | 79000000 | 848000000 | 5.710000 | 148511383.500000 | 11.823118 | -12.308821 | 8 |
| 109 | 70.790001 | -4.078593 | 1.941104 | 35 | 39 | -1368707000 | -4524515000 | -8.290000 | 545779855.200000 | 93.089287 | 1.416243 | 3 |
| 110 | 302.399994 | 10.019650 | 1.308082 | 7 | 164 | 1617921000 | 187774000 | 3.250000 | 57776615.380000 | 93.046152 | 23.856728 | 9 |
| 111 | 81.589996 | 8.037605 | 1.056186 | 8 | 47 | 2196000 | 870120000 | 2.370000 | 367139240.500000 | 34.426159 | -1.269332 | 9 |
| 112 | 52.130001 | -21.253771 | 2.364883 | 2 | 201 | 523803000 | 85171000 | 0.560000 | 152091071.400000 | 93.089287 | 9.567952 | 3 |
| 113 | 51.070000 | 0.709921 | 1.232829 | 8 | 1 | -14756000 | 878485000 | 2.770000 | 317142599.300000 | 18.436823 | -1.169833 | 11 |
| 114 | 239.410004 | 6.765073 | 1.118425 | 4 | 47 | 4073000 | 232120000 | 3.500000 | 66320000.000000 | 68.402858 | -3.089477 | 9 |
| 115 | 29.639999 | 12.656785 | 1.452048 | 5 | 99 | 450000000 | 268000000 | 0.920000 | 291304347.800000 | 32.217390 | -0.597313 | 4 |
| 116 | 52.040001 | 1.166411 | 1.521430 | 13 | 10 | -513000000 | 1979000000 | 4.250000 | 465647058.800000 | 12.244706 | -8.639591 | 6 |
| 117 | 68.360001 | 4.910983 | 1.217401 | 2 | 44 | -71065000 | -156734000 | -0.990000 | 158317171.700000 | 18.456543 | 6.174024 | 11 |
| 118 | 78.980003 | 11.616734 | 1.666482 | 20 | 257 | 64600000 | 494900000 | 2.300000 | 215173913.000000 | 34.339132 | 6.349747 | 5 |
| 119 | 27.770000 | -6.403775 | 1.351595 | 9 | 74 | 4624000000 | 2269000000 | 2.685000 | 299887089.100000 | 17.313076 | -1.715880 | 11 |
| 120 | 45.099998 | -4.449159 | 1.062553 | 27 | 94 | -119311000 | 457223000 | 2.420000 | 188935124.000000 | 18.636363 | 5.991459 | 6 |
| 121 | 124.300003 | 4.894517 | 1.578747 | 16 | 29 | 273599000 | 764465000 | 5.870000 | 130232538.300000 | 21.175469 | -44.103494 | 1 |
| 122 | 88.209999 | 13.922251 | 1.186059 | 19 | 39 | 28136000 | 394950000 | 1.580000 | 249968354.400000 | 55.829113 | -14.151445 | 9 |
| 123 | 14.090000 | 2.398256 | 1.151454 | 26 | 43 | 3515000000 | 7373000000 | 1.860000 | 3963978495.000000 | 7.575269 | 5.108756 | 1 |
| 124 | 40.820000 | 10.984234 | 1.411396 | 29 | 35 | 14523000 | 516361000 | 1.770000 | 291729378.500000 | 23.062147 | 4.427425 | 6 |
| 125 | 104.660004 | 16.224320 | 1.320606 | 8 | 958 | 592000000 | 3669000000 | 1.310000 | 2800763359.000000 | 79.893133 | 5.884467 | 7 |
| 126 | 55.500000 | 16.817517 | 1.348297 | 13 | 31 | 46600000 | 315000000 | 1.970000 | 159898477.200000 | 28.172589 | -2.100708 | 6 |
| 127 | 6.770000 | -31.685167 | 3.796410 | 155 | 5 | -240000000 | -12156000000 | -11.310000 | 1074801061.000000 | 22.811951 | 2.935427 | 8 |
| 128 | 31.730000 | 1.179844 | 1.238785 | 5 | 2 | 46000000 | 578000000 | 1.370000 | 421897810.200000 | 23.160584 | -6.072561 | 11 |
| 129 | 60.599998 | -10.553508 | 1.148295 | 7 | 29 | 194800000 | 650800000 | 2.220000 | 293153153.200000 | 27.297296 | -19.086610 | 7 |
| 130 | 91.459999 | 5.235295 | 0.904487 | 27 | 18 | -19000000 | 712000000 | 3.040000 | 234210526.300000 | 30.085526 | -7.975730 | 7 |
| 131 | 28.070000 | 0.214209 | 1.761193 | 15 | 81 | -58589000 | 241686000 | 1.730000 | 139702890.200000 | 16.225434 | 4.014713 | 7 |
| 132 | 47.220001 | 10.819056 | 1.774454 | 14 | 73 | -43239000 | 412512000 | 2.850000 | 144741052.600000 | 16.568421 | 14.992623 | 6 |
| 133 | 42.080002 | 2.210357 | 1.781661 | 16 | 27 | -83906000 | 267669000 | 2.010000 | 133168656.700000 | 20.935324 | 6.746760 | 6 |
| 134 | 39.130001 | 15.088238 | 2.175738 | 26 | 5 | -30900000 | 489000000 | 3.660000 | 133606557.400000 | 10.691257 | 5.101546 | 8 |
| 135 | 146.100006 | 6.806063 | 1.239858 | 13 | 47 | -26905000 | 210219000 | 3.040000 | 69150986.840000 | 48.059213 | -3.973395 | 9 |
| 136 | 65.989998 | 55.051683 | 2.075216 | 10 | 190 | -355228000 | 546421000 | 5.420000 | 100815682.700000 | 12.175276 | 22.563791 | 7 |
| 137 | 4.670000 | -2.301255 | 2.026818 | 3 | 496 | 254000000 | -196000000 | -0.290000 | 675862069.000000 | 14.518987 | 10.497704 | 10 |
| 138 | 137.360001 | -0.463767 | 0.939544 | 28 | 22 | -1603000000 | 2965000000 | 9.230000 | 321235102.900000 | 14.881907 | 4.242998 | 6 |
| 139 | 27.209999 | 4.212937 | 1.390342 | 17 | 47 | -15576000 | 1374561000 | 3.040000 | 158299350.600000 | 47.366973 | -3.973395 | 9 |
| 140 | 101.190002 | 2.689264 | 1.494060 | 98 | 148 | 2824000000 | 18108000000 | 12.370000 | 1463864188.000000 | 8.180275 | 3.159446 | 5 |
| 141 | 18.280001 | 6.588927 | 1.578483 | 7 | 163 | -809000000 | 1339000000 | 1.020000 | 1312745098.000000 | 17.921570 | 3.617610 | 7 |
| 142 | 34.009998 | 12.281271 | 1.344514 | 24 | 33 | -3857000000 | 9687000000 | 6.110000 | 1585433715.000000 | 5.566284 | -4.892037 | 1 |
| 143 | 85.889999 | 4.033432 | 1.177027 | 22 | 5 | 73901000 | 705672000 | 4.650000 | 151757419.400000 | 18.470968 | 7.205242 | 1 |
| 144 | 37.169998 | 3.393594 | 1.665475 | 14 | 121 | -363198000 | 456227000 | 2.390000 | 190889958.200000 | 15.552300 | 5.760057 | 1 |
| 145 | 32.669998 | 10.446241 | 1.522778 | 8 | 30 | -685000000 | 307000000 | 1.140000 | 269298245.600000 | 28.657893 | 3.835896 | 1 |
| 146 | 202.589996 | -5.336199 | 1.348597 | 34 | 16 | 63492000 | 768996000 | 11.690000 | 65782378.100000 | 17.330196 | 12.112879 | 6 |
| 147 | 34.040001 | -5.101751 | 1.966062 | 4 | 189 | 7786000000 | -671000000 | -0.790000 | 849367088.600000 | 93.089287 | 17.345857 | 3 |
| 148 | 67.360001 | -7.076834 | 1.583355 | 27 | 92 | 83583000 | 451838000 | 3.610000 | 125162880.900000 | 18.659280 | 12.145326 | 1 |
| 149 | 11.060000 | 4.143126 | 1.337793 | 11 | 99 | -373409000 | 692957000 | 0.820000 | 845069512.200000 | 13.487805 | -0.065057 | 4 |
| 150 | 67.629997 | -12.532337 | 1.914907 | 28 | 13 | 175000000 | 2129000000 | 5.140000 | 414202334.600000 | 13.157587 | -7.279051 | 5 |
| 151 | 68.029999 | 0.044116 | 1.341731 | 6 | 47 | -112818000 | 849073000 | 2.350000 | 361307659.600000 | 28.948936 | -3.738047 | 9 |
| 152 | 34.826959 | 2.218658 | 1.282286 | 6 | 47 | 162690000 | -559235000 | -1.210000 | 462177686.000000 | 33.651715 | -1.269332 | 9 |
| 153 | 48.480000 | -4.585713 | 2.398580 | 16 | 103 | 272000000 | -3056000000 | -10.780000 | 283487940.600000 | 28.407929 | 6.264817 | 3 |
| 154 | 43.459999 | -5.005467 | 1.147332 | 10 | 99 | 49000000 | 1682000000 | 4.050000 | 415308642.000000 | 10.730864 | -4.327138 | 4 |
| 155 | 45.389999 | -17.247036 | 1.560372 | 41 | 31 | -184471000 | 752207000 | 3.710000 | 202751212.900000 | 12.234501 | 6.069389 | 1 |
| 156 | 103.570000 | 9.320247 | 1.103449 | 26 | 41 | -1504000000 | 4768000000 | 6.110000 | 780360065.500000 | 16.950900 | -3.709826 | 6 |
| 157 | 15.200000 | -17.837838 | 3.400491 | 7 | 45 | 7523000000 | 2461000000 | 1.800000 | 513987730.100000 | 25.309524 | 3.954975 | 7 |
| 158 | 11.840000 | 2.161759 | 2.373359 | 16 | 18 | 2300000000 | 4554000000 | 1.800000 | 513987730.100000 | 25.309524 | 3.954975 | 7 |
| 159 | 39.540001 | 24.496225 | 1.078455 | 17 | 29 | 13065000 | 686088000 | 3.280000 | 199237804.900000 | 24.070121 | -1.980483 | 2 |
| 160 | 158.190002 | 18.317132 | 1.013922 | 17 | 3 | -17388000 | 479058000 | 5.780000 | 82882006.920000 | 27.368512 | 5.925677 | 5 |
| 161 | 15.340000 | -3.217666 | 1.594628 | 8 | 47 | -445000000 | 558000000 | 0.220000 | 2536363636.000000 | 69.727273 | -0.113548 | 9 |
| 162 | 89.269997 | -3.261814 | 1.188383 | 51 | 16 | -28325000 | 512951000 | 3.280000 | 199237804.900000 | 24.070121 | -1.980483 | 2 |
| 163 | 178.509995 | -0.145443 | 1.615206 | 12 | 70 | 636000000 | 1276000000 | 8.540000 | 149414519.900000 | 20.902810 | -8.805281 | 5 |
| 164 | 137.619995 | -5.292136 | 1.082881 | 92 | 24 | -790000000 | 13190000000 | 13.480000 | 978486646.900000 | 10.209198 | 4.852391 | 7 |
| 165 | 72.919998 | -1.565880 | 1.469586 | 228 | 40 | -193542000 | 192078000 | 2.070000 | 92791304.350000 | 35.227052 | -0.981083 | 5 |
| 166 | 119.639999 | 14.961083 | 1.152855 | 26 | 25 | -296585000 | 419247000 | 5.190000 | 80779768.790000 | 23.052023 | 5.047700 | 8 |
| 167 | 34.450001 | 14.035095 | 1.226022 | 19 | 162 | 12747000000 | 11420000000 | 2.410000 | 4738589212.000000 | 14.294606 | 3.954975 | 7 |
| 168 | 37.700001 | -0.026513 | 1.301630 | 24 | 27 | -831000000 | 938000000 | 2.250000 | 416888888.900000 | 16.755556 | 6.123934 | 8 |
| 169 | 23.280001 | 21.821035 | 1.139799 | 23 | 20 | -157700000 | 454600000 | 1.110000 | 409549549.600000 | 20.972974 | 0.265658 | 1 |
| 170 | 27.010000 | -13.067267 | 1.301382 | 24 | 15 | 2448000 | 123241000 | 0.580000 | 212484482.800000 | 46.568966 | -2.763651 | 9 |
| 171 | 546.159973 | 18.733013 | 1.126009 | 14 | 317 | 114300000 | 588800000 | 15.870000 | 37101449.280000 | 34.414617 | 42.607500 | 5 |
| 172 | 92.680000 | 12.776831 | 1.142869 | 36 | 130 | -900000000 | 1899000000 | 5.160000 | 368023255.800000 | 17.961240 | 7.586477 | 6 |
| 173 | 33.480000 | 7.067477 | 1.580839 | 12 | 67 | 412000000 | 968100000 | 2.260000 | 428362831.900000 | 14.814159 | 4.218620 | 4 |
| 174 | 73.360001 | 2.961405 | 1.218373 | 33 | 1 | -395000 | 427235000 | 3.690000 | 115781842.800000 | 19.880759 | 2.823845 | 6 |
| 175 | 41.950001 | 11.539484 | 1.732990 | 7 | 23 | -271788000 | 302971000 | 2.420000 | 125194628.100000 | 17.334711 | 6.294943 | 6 |
| 176 | 27.600000 | 7.351229 | 1.841767 | 14 | 108 | -218700000 | 633700000 | 1.620000 | 391172839.500000 | 17.037037 | 2.752236 | 7 |
| 177 | 66.029999 | 8.033377 | 1.130337 | 10 | 99 | -7341000000 | 24442000000 | 6.050000 | 4040000000.000000 | 10.914049 | -1.886881 | 4 |
| 178 | 26.459999 | 8.709938 | 1.224688 | 18 | 47 | 2212000 | 894115000 | 2.010000 | 444833333.300000 | 13.164179 | -1.269332 | 9 |
| 179 | 127.300003 | 17.511309 | 0.870405 | 582 | 10 | -170000000 | 1013000000 | 2.780000 | 364388489.200000 | 45.791368 | -2.533011 | 2 |
| 180 | 14.920000 | -47.129693 | 3.139352 | 1 | 7 | -86000000 | 253000000 | 0.100000 | 2530000000.000000 | 149.200000 | -1.894071 | 3 |
| 181 | 42.959999 | 6.812526 | 0.889913 | 29 | 74 | -1649000000 | 7351000000 | 1.690000 | 4349704142.000000 | 25.420118 | -1.466307 | 2 |
| 182 | 74.669998 | -18.438017 | 2.071639 | 12 | 18 | -211400000 | 483500000 | 4.410000 | 109637188.200000 | 16.931972 | -2.012091 | 6 |
| 183 | 42.020000 | 1.965545 | 1.204037 | 30 | 36 | -79600000 | 329200000 | 2.310000 | 142510822.500000 | 18.190476 | 2.931005 | 6 |
| 184 | 48.910000 | 1.705136 | 1.569167 | 14 | 25 | -123369000 | 802894000 | 3.870000 | 207466149.900000 | 12.638243 | -0.307832 | 1 |
| 185 | 123.639999 | 14.174899 | 1.603130 | 9 | 42 | 136400000 | 436900000 | 5.030000 | 178246546.300000 | 15.900937 | -1.294844 | 5 |
| 186 | 29.629999 | 4.441304 | 1.427237 | 14 | 12 | -27208000 | 423223000 | 3.030000 | 213598256.500000 | 20.819876 | -0.857290 | 1 |
| 187 | 119.510002 | 14.539013 | 1.513434 | 6 | 7 | -235000000 | -240000000 | -2.970000 | 80808080.810000 | 17.334711 | 14.280750 | 6 |
| 188 | 84.260002 | 0.789478 | 1.440622 | 17 | 54 | -205200000 | 2408400000 | 2.270000 | 1060969163.000000 | 37.118944 | -0.651103 | 5 |
| 189 | 217.149994 | 5.254227 | 0.903098 | 116 | 8 | -356000000 | 3605000000 | 11.620000 | 310240963.900000 | 18.687607 | -10.852854 | 6 |
| 190 | 31.225001 | 6.642759 | 1.115842 | 10 | 0 | -51100000 | 388400000 | 3.360000 | 115595238.100000 | 9.293155 | -4.606591 | 11 |
| 191 | 17.389999 | -14.292764 | 1.554235 | 2 | 81 | -638127000 | 252111000 | 0.740000 | 340690540.500000 | 23.499999 | 19.821416 | 4 |
| 192 | 43.060001 | 13.855106 | 1.536290 | 30 | 41 | 301000000 | 2181000000 | 3.300000 | 660909090.900000 | 13.048485 | -5.117194 | 6 |
| 193 | 54.360001 | 24.707504 | 1.457013 | 34 | 61 | 274000000 | 3433000000 | 9.710000 | 353553038.100000 | 5.598352 | -2.245773 | 10 |
| 194 | 86.900002 | 2.573183 | 1.609745 | 68 | 46 | -107000000 | 4476000000 | 9.620000 | 465280665.300000 | 9.033264 | 10.316354 | 8 |
| 195 | 97.360001 | 7.496965 | 1.095876 | 63 | 130 | 610000000 | 3808000000 | 3.360000 | 1133333333.000000 | 28.976191 | 3.451765 | 7 |
| 196 | 90.809998 | 10.622490 | 1.177776 | 12 | 47 | 10906000 | 350745000 | 4.410000 | 79534013.610000 | 20.591836 | -3.089477 | 9 |
| 197 | 80.690002 | 4.183351 | 1.169328 | 10 | 47 | 1603000 | 487562000 | 3.080000 | 158299350.600000 | 26.198053 | -3.973395 | 9 |
| 198 | 67.040001 | -1.973970 | 1.642450 | 24 | 3 | -8000000 | 859000000 | 3.220000 | 266770186.300000 | 20.819876 | -12.370198 | 1 |
| 199 | 28.299999 | 11.637077 | 1.428359 | 263 | 61 | 85000000 | 355000000 | 1.030000 | 344660194.200000 | 27.475727 | 2.219577 | 6 |
| 200 | 27.170000 | 30.625006 | 1.921708 | 14 | 54 | -78836000 | 369416000 | 1.080000 | 342051851.900000 | 25.157407 | 4.535251 | 1 |
| 201 | 118.139999 | 19.939085 | 0.733163 | 64 | 260 | 5607600000 | 4529300000 | 4.820000 | 939688796.700000 | 24.510373 | 7.122145 | 1 |
| 202 | 100.339996 | 2.345976 | 1.268800 | 167 | 183 | 537900000 | 941300000 | 4.700000 | 200276595.700000 | 21.348935 | 8.615585 | 4 |
| 203 | 44.840000 | 6.079962 | 1.321548 | 26 | 17 | 239000000 | 7267000000 | 4.490000 | 1618485523.000000 | 9.986637 | -12.809506 | 2 |
| 204 | 48.209999 | 1.366690 | 1.138650 | 8 | 99 | 1944000000 | 5310000000 | 4.610000 | 1151843818.000000 | 10.457700 | -1.883912 | 4 |
| 205 | 189.389999 | 3.514425 | 1.492478 | 13 | 3 | -16185000 | 615302000 | 2.590000 | 237568339.800000 | 73.123552 | -3.980316 | 1 |
| 206 | 78.949997 | 12.081196 | 1.718403 | 103 | 136 | 403700000 | 653500000 | 3.280000 | 199237804.900000 | 24.070121 | 6.495755 | 2 |
| 207 | 85.559998 | 6.976738 | 1.032221 | 24 | 9 | 35300000 | 401600000 | 3.140000 | 127898089.200000 | 27.248407 | -1.980483 | 2 |
| 208 | 136.580002 | -10.866015 | 2.164150 | 7 | 46 | 59758000 | 288792000 | 4.310000 | 67005104.410000 | 31.689096 | 3.050887 | 8 |
| 209 | 55.450001 | 6.022948 | 1.034162 | 25 | 99 | -584000000 | 1599000000 | 3.010000 | 531229235.900000 | 18.421927 | -1.950194 | 4 |
| 210 | 150.639999 | 5.927847 | 0.982698 | 42 | 27 | -99000000 | 4833000000 | 7.720000 | 626036269.400000 | 19.512953 | 2.023844 | 6 |
| 211 | 49.653332 | 10.800357 | 1.585944 | 11 | 568 | 1805094000 | 546733000 | 3.710000 | 1469541779.000000 | 25.420118 | -5.190734 | 2 |
| 212 | 58.209999 | 6.885788 | 0.959008 | 182 | 33 | -952000000 | 5241000000 | 2.670000 | 1962921348.000000 | 21.801498 | -6.632971 | 2 |
| 213 | 27.590000 | -11.229086 | 2.830675 | 10 | 62 | -1098300000 | 1000400000 | 2.790000 | 358566308.200000 | 9.888889 | 5.846617 | 8 |
| 214 | 51.840000 | 11.507846 | 1.989371 | 22 | 18 | -367000000 | 2852000000 | 5.290000 | 539130434.800000 | 9.799622 | 5.798226 | 3 |
| 215 | 52.820000 | 7.031413 | 1.278460 | 10 | 70 | 1083000000 | 4442000000 | 1.580000 | 2811392405.000000 | 33.430380 | -4.282931 | 5 |
| 216 | 12.590000 | -20.265991 | 3.325386 | 12 | 71 | -1177000000 | -2204000000 | -3.260000 | 676073619.600000 | 93.089287 | 1.273530 | 3 |
| 217 | 121.180000 | -0.361785 | 1.380390 | 7 | 99 | -5317000 | 1079667000 | 7.220000 | 149538365.700000 | 16.783934 | -0.938007 | 4 |
| 218 | 339.130005 | 18.942905 | 1.115376 | 61 | 18 | 13624000 | 352820000 | 12.750000 | 27672156.860000 | 26.598432 | 4.403994 | 5 |
| 219 | 22.450001 | -8.591197 | 2.851180 | 43 | 27 | -910125000 | -2270833000 | -13.030000 | 174277283.200000 | 28.407929 | -1.298006 | 3 |
| 220 | 54.070000 | 33.177346 | 2.299304 | 9 | 30 | 1010500000 | 847600000 | 1.580000 | 1633015873.000000 | 33.430380 | -4.213309 | 5 |
| 221 | 11.450000 | 1.868327 | 2.230827 | 25 | 99 | 151000000 | 997000000 | 2.660000 | 374812030.100000 | 4.304511 | -1.880943 | 4 |
| 222 | 32.930000 | 7.298791 | 2.509437 | 24 | 57 | -155000000 | -2441000000 | -6.070000 | 402141680.400000 | 93.089287 | 1.171229 | 3 |
| 223 | 58.169998 | 8.810324 | 1.563258 | 8 | 117 | -126000000 | 428000000 | 2.560000 | 167187500.000000 | 22.722655 | -11.717383 | 4 |
| 224 | 103.889999 | 6.237855 | 1.023375 | 12 | 6 | -6000000 | 2752000000 | 6.110000 | 450409165.300000 | 17.003273 | -7.353314 | 11 |
| 225 | 17.990000 | 10.844116 | 2.536050 | 2 | 198 | 379000000 | 220000000 | 0.430000 | 511627907.000000 | 41.837209 | 6.971864 | 8 |
| 226 | 114.379997 | 11.145654 | 2.605949 | 6 | 65 | 695722000 | 122641000 | 0.290000 | 422900000.000000 | 394.413783 | -5.700168 | 7 |
| 227 | 32.560001 | -3.296695 | 2.421529 | 244 | 1 | -9000000 | -3362000000 | -21.180000 | 158734655.300000 | 93.089287 | -0.138596 | 3 |
| 228 | 46.599998 | 4.931317 | 1.198493 | 13 | 21 | 84000000 | 570000000 | 1.550000 | 367741935.500000 | 30.064515 | -12.375526 | 6 |
| 229 | 33.490002 | -12.558739 | 1.952020 | 5 | 49 | -1456000000 | -769000000 | -1.990000 | 386432160.800000 | 93.089287 | 9.582536 | 3 |
| 230 | 84.589996 | 9.529966 | 2.168814 | 13 | 49 | 128000000 | 1556000000 | 5.130000 | 303313840.200000 | 16.489278 | 0.926433 | 6 |
| 231 | 72.089996 | 5.796884 | 1.281566 | 11 | 67 | 3394000000 | 973800000 | 4.030000 | 241637717.100000 | 17.888336 | -13.398380 | 4 |
| 232 | 40.299999 | 6.585554 | 1.460619 | 5 | 147 | 915325000 | 357659000 | 1.110000 | 322215315.300000 | 36.306305 | 11.168107 | 8 |
| 233 | 44.080002 | 9.980039 | 1.641300 | 19 | 14 | 75400000 | 350000000 | 1.300000 | 269230769.200000 | 33.907694 | -2.075543 | 1 |
| 234 | 51.630001 | 8.420836 | 1.104581 | 4 | 47 | 36442000 | 283766000 | 1.090000 | 260335779.800000 | 47.366973 | -3.973395 | 9 |
| 235 | 24.660000 | -24.123077 | 3.560178 | 73 | 6 | -75150000 | 244977000 | 1.170000 | 209382051.300000 | 21.076923 | -8.025110 | 3 |
| 236 | 75.660004 | 14.810321 | 1.066369 | 45 | 18 | 217100000 | 1093900000 | 4.430000 | 246930022.600000 | 17.079008 | -10.464098 | 1 |
| 237 | 253.419998 | 0.964143 | 1.089370 | 47 | 4 | -134259000 | 931216000 | 9.320000 | 99915879.830000 | 27.190987 | -0.364026 | 1 |
| 238 | 67.610001 | 0.865287 | 1.589520 | 32 | 64 | -588000000 | -7829000000 | -10.230000 | 765298142.700000 | 93.089287 | 3.345102 | 3 |
| 239 | 16.150000 | 3.128991 | 1.132813 | 5 | 99 | -298400000 | 260100000 | 0.860000 | 302441860.500000 | 18.779070 | -0.426859 | 4 |
| 240 | 20.650000 | 3.821021 | 1.259611 | 228 | 34 | -403561000 | 407943000 | 2.040000 | 199972058.800000 | 10.122549 | -0.733207 | 6 |
| 241 | 47.400002 | -9.317004 | 1.439564 | 23 | 110 | 278800000 | 1604000000 | 4.520000 | 354867256.600000 | 10.486726 | 6.290521 | 6 |
| 242 | 53.189999 | 0.510206 | 1.039803 | 5 | 6 | -28000000 | 888000000 | 1.810000 | 490607734.800000 | 29.386740 | -1.121059 | 11 |
| 243 | 1274.949951 | 3.190527 | 1.268340 | 29 | 184 | -1671386000 | 2551360000 | 50.090000 | 50935516.070000 | 25.453183 | -1.052429 | 1 |
| 244 | 38.689999 | -8.230553 | 1.180661 | 13 | 11 | -8000000 | 1679000000 | 3.320000 | 505722891.600000 | 11.653614 | -0.361858 | 11 |
| 245 | 99.919998 | 6.072188 | 0.805357 | 45 | 68 | 2962000000 | 5452000000 | 3.710000 | 1469541779.000000 | 26.932614 | -5.190734 | 2 |
| 246 | 32.279999 | 3.130991 | 1.238748 | 11 | 79 | 298000000 | 6960000000 | 1.130000 | 6159292035.000000 | 28.566371 | -4.213309 | 5 |
| 247 | 44.980000 | -5.305263 | 1.528985 | 13 | 117 | 700900000 | 1234000000 | 4.110000 | 300243309.000000 | 10.944039 | -4.526995 | 4 |
| 248 | 79.410004 | 10.660538 | 0.806056 | 17 | 129 | 160383000 | 636056000 | 3.280000 | 491391569.000000 | 24.070121 | -2.256747 | 2 |
| 249 | 31.799999 | 3.515625 | 1.086898 | 17 | 99 | 116000000 | 1267600000 | 2.160000 | 586851851.900000 | 14.722222 | -0.843313 | 4 |
| 250 | 17.820000 | -5.564393 | 1.694751 | 10 | 25 | -533785000 | 494090000 | 1.380000 | 358036231.900000 | 12.913043 | -0.307832 | 1 |
| 251 | 87.910004 | 10.328820 | 0.861453 | 52 | 22 | 1735000000 | 6873000000 | 4.420000 | 1554977376.000000 | 19.889141 | -1.418027 | 2 |
| 252 | 95.309998 | 6.993709 | 1.120534 | 9 | 99 | -295000000 | 4106000000 | 7.520000 | 546010638.300000 | 12.674202 | -0.694126 | 4 |
| 253 | 49.529999 | -3.034462 | 1.875910 | 2 | 8 | 15900000 | -76400000 | -0.420000 | 181904761.900000 | 14.579221 | -6.574869 | 6 |
| 254 | 64.480003 | 0.498752 | 1.143421 | 10 | 3 | 31884000 | 437257000 | 3.940000 | 110978934.000000 | 16.365483 | -6.089228 | 11 |
| 255 | 98.820000 | 19.160738 | 1.533003 | 28 | 31 | 625000000 | 1406000000 | 5.180000 | 271428571.400000 | 19.077220 | -0.593158 | 8 |
| 256 | 34.130001 | 3.424245 | 1.109059 | 7 | 22 | -563000000 | 682000000 | 1.010000 | 675247524.800000 | 33.792080 | -2.827111 | 11 |
| 257 | 81.410004 | 6.585503 | 1.227467 | 13 | 117 | 2694000000 | 5642000000 | 12.370000 | 456103476.200000 | 6.581245 | -4.178927 | 4 |
| 258 | 81.800003 | 5.371643 | 1.379589 | 18 | 41 | -2133000000 | 4227000000 | 7.780000 | 543316195.400000 | 10.514139 | 7.029056 | 3 |
| 259 | 20.250000 | -16.632362 | 2.954291 | 10 | 11 | -61744000 | 321824000 | 1.590000 | 202405031.400000 | 12.735849 | 4.291894 | 6 |
| 260 | 102.400002 | 0.293834 | 1.131240 | 35 | 8 | 21000000 | 1547000000 | 5.390000 | 287012987.000000 | 18.998145 | 0.574887 | 8 |
| 261 | 36.200001 | 17.456201 | 1.925754 | 9 | 25 | -808000000 | 1228000000 | 1.000000 | 1228000000.000000 | 36.200001 | 5.434039 | 7 |
| 262 | 56.830002 | -23.244191 | 1.945966 | 15 | 4 | 10853000 | 304768000 | 5.750000 | 53003130.430000 | 9.883479 | -12.020894 | 6 |
| 263 | 101.209999 | 13.425973 | 1.556512 | 8 | 9 | -67676000 | 665783000 | 3.030000 | 219730363.000000 | 33.402640 | -15.727481 | 1 |
| 264 | 542.869995 | 16.995320 | 1.802345 | 17 | 129 | 160383000 | 636056000 | 6.170000 | 103088492.700000 | 87.985412 | 20.409000 | 5 |
| 265 | 47.139999 | -7.659158 | 1.142370 | 36 | 34 | -62542000 | 357796000 | 2.720000 | 131542647.100000 | 17.330882 | 4.089472 | 6 |
| 266 | 189.789993 | 20.432767 | 1.058807 | 13 | 108 | 168081000 | 696067000 | 6.920000 | 100587716.800000 | 27.426300 | -16.215469 | 6 |
| 267 | 24.610001 | -25.106512 | 3.712995 | 26 | 0 | 23000 | -713685000 | -4.290000 | 166360139.900000 | 93.089287 | 0.525090 | 3 |
| 268 | 43.990002 | 6.745943 | 0.839821 | 10 | 2 | -42800000 | 749900000 | 2.140000 | 350420560.700000 | 20.556076 | -2.428225 | 6 |
| 269 | 60.490002 | 7.232764 | 1.266240 | 14 | 9 | 39000000 | 746000000 | 5.220000 | 142911877.400000 | 11.588123 | -4.016461 | 11 |
| 270 | 32.930000 | 15.462833 | 1.456940 | 11 | 99 | 615000000 | 1447000000 | 1.040000 | 1391346154.000000 | 31.663462 | -0.130090 | 4 |
| 271 | 23.940001 | -9.898378 | 2.030786 | 3 | 6 | -2000000 | 196000000 | 0.290000 | 675862069.000000 | 82.551728 | -2.580408 | 3 |
| 272 | 44.599998 | -5.146750 | 1.580117 | 43 | 19 | 73000000 | 225400000 | 1.630000 | 138282208.600000 | 27.361962 | -2.716908 | 8 |
| 273 | 259.600006 | 16.537983 | 1.426488 | 121 | 10 | 165012000 | 1053849000 | 11.380000 | 92605360.280000 | 22.811951 | 2.825366 | 8 |
| 274 | 112.980003 | 4.004424 | 1.091967 | 4 | 47 | -26010000 | 284084000 | 1.020000 | 278513725.500000 | 110.764709 | -3.089477 | 9 |
| 275 | 55.209999 | 12.238260 | 1.773865 | 40 | 23 | -654720000 | 606828000 | 4.680000 | 129664102.600000 | 11.797008 | -7.961579 | 1 |
| 276 | 46.790001 | 4.372070 | 0.895059 | 11 | 15 | 694000000 | 2421000000 | 2.600000 | 931153846.200000 | 17.996154 | -2.795456 | 11 |
| 277 | 194.440002 | 5.284822 | 1.135546 | 48 | 47 | 88852000 | 2139375000 | 5.880000 | 363839285.700000 | 33.068028 | -1.269332 | 9 |
| 278 | 98.580002 | 14.044424 | 1.080858 | 596 | 51 | -1016000000 | 1156000000 | 4.260000 | 271361502.300000 | 23.140846 | -4.178927 | 4 |
| 279 | 120.599998 | -13.906342 | 1.203816 | 10 | 9 | 33398000 | 267046000 | 3.020000 | 88425827.810000 | 39.933774 | -18.864194 | 6 |
| 280 | 94.010002 | -2.791849 | 1.126448 | 11 | 9 | -167000000 | 1350000000 | 5.430000 | 248618784.500000 | 17.313076 | -8.547222 | 11 |
| 281 | 42.840000 | 11.970732 | 1.437938 | 8 | 99 | -2630000000 | 1933000000 | 3.620000 | 533977900.600000 | 11.834254 | -2.481376 | 4 |
| 282 | 66.360001 | -0.866449 | 1.444644 | 9 | 117 | -648000000 | 1980000000 | 4.530000 | 437086092.700000 | 14.649007 | -4.044970 | 4 |
| 283 | 76.830002 | -8.513933 | 2.017394 | 25 | 225 | 237800000 | 798300000 | 4.210000 | 189619952.500000 | 18.249407 | 7.413777 | 7 |
| 284 | 7.110000 | -44.798137 | 4.580042 | 200 | 2 | -38000000 | -4556000000 | -6.070000 | 402141680.400000 | 93.089287 | 1.273530 | 3 |
| 285 | 30.410000 | -2.874478 | 1.835028 | 18 | 99 | 497000000 | 2214000000 | 2.660000 | 832330827.100000 | 11.432331 | -0.842213 | 4 |
| 286 | 92.940002 | -1.650792 | 1.138163 | 17 | 116 | 1584000000 | 1439000000 | 3.820000 | 376701570.700000 | 24.329843 | 7.026782 | 5 |
| 287 | 34.410000 | 5.942118 | 0.859442 | 11 | 11 | -3482000000 | 13345000000 | 2.370000 | 5630801688.000000 | 14.518987 | -23.537323 | 10 |
| 288 | 93.919998 | 13.129368 | 1.217803 | 5 | 35 | -150300000 | 359500000 | 1.940000 | 185309278.400000 | 48.412370 | -25.385129 | 2 |
| 289 | 26.420000 | -8.833678 | 2.730659 | 25 | 88 | 5000000 | -214000000 | -1.530000 | 139869281.000000 | 74.555557 | 4.068084 | 7 |
| 290 | 25.520000 | 13.624226 | 1.797269 | 21 | 21 | 10716000 | 459522000 | 2.040000 | 225255882.400000 | 12.509804 | -12.726553 | 1 |
| 291 | 57.160000 | 1.168142 | 1.022968 | 13 | 99 | -4636000 | 527100000 | 4.210000 | 125201900.200000 | 13.577197 | -1.883912 | 4 |
| 292 | 141.850006 | 15.607180 | 1.247751 | 9 | 11 | -891400000 | 1975400000 | 4.960000 | 398266129.000000 | 28.598792 | -28.032512 | 5 |
| 293 | 85.250000 | 34.803917 | 1.578344 | 14 | 212 | 159000000 | 198000000 | 1.380000 | 143478260.900000 | 61.775362 | 2.627576 | 1 |
| 294 | 112.860001 | 13.029548 | 0.959365 | 15 | 99 | 6000000 | 3439000000 | 10.990000 | 312920837.100000 | 10.269336 | -0.891599 | 4 |
| 295 | 85.500000 | 1.303315 | 1.431109 | 29 | 10 | 12679000 | 410395000 | 3.030000 | 135443894.400000 | 28.217822 | 6.010954 | 1 |
| 296 | 53.330002 | 23.249369 | 1.586719 | 13 | 19 | 250000000 | 1220000000 | 3.280000 | 199237804.900000 | 24.070121 | -1.980483 | 2 |
| 297 | 105.370003 | 8.584091 | 1.854132 | 30 | 37 | -58000000 | 1540000000 | 12.500000 | 123200000.000000 | 8.429600 | 4.594156 | 3 |
| 298 | 49.799999 | 9.234479 | 1.579248 | 20 | 115 | 100145000 | 369041000 | 1.980000 | 186384343.400000 | 25.151515 | -2.347391 | 7 |
| 299 | 54.810001 | 9.971912 | 1.263479 | 30 | 126 | -199000000 | 2986000000 | 2.860000 | 1044055944.000000 | 19.164336 | 2.768051 | 7 |
| 300 | 80.610001 | -16.948277 | 1.758824 | 14 | 27 | -463323000 | 232573000 | 3.030000 | 213598256.500000 | 20.819876 | -0.857290 | 1 |
| 301 | 57.299999 | 8.215294 | 1.747606 | 82 | 42 | 1004000000 | 7340000000 | 19.520000 | 376024590.200000 | 2.935451 | -23.195292 | 6 |
| 302 | 37.570000 | 8.583821 | 1.157906 | 12 | 47 | -8482000 | 340383000 | 1.300000 | 261833076.900000 | 28.900000 | -3.089477 | 9 |
| 303 | 119.489998 | -5.136552 | 2.048697 | 16 | 6 | 29159000 | 680528000 | 6.890000 | 98770391.870000 | 17.342525 | 6.255903 | 5 |
| 304 | 117.639999 | 1.466273 | 1.482349 | 17 | 70 | 3428000000 | 5813000000 | 6.100000 | 952950819.700000 | 19.285246 | -8.805281 | 5 |
| 305 | 33.290001 | 3.804181 | 1.102848 | 10 | 117 | 10400000 | 867100000 | 3.510000 | 247037037.000000 | 9.484331 | -4.178927 | 4 |
| 306 | 78.199997 | -12.371135 | 1.430297 | 23 | 43 | -195000000 | 4772000000 | 5.510000 | 866061706.000000 | 14.192377 | 1.066899 | 6 |
| 307 | 96.230003 | -2.797977 | 0.826408 | 196 | 44 | 439000000 | 4844000000 | 5.380000 | 900371747.200000 | 17.886618 | 1.069558 | 6 |
| 308 | 96.070000 | 8.065239 | 0.949396 | 28 | 31 | 1630000000 | 7608000000 | 8.720000 | 872477064.200000 | 11.017202 | -13.198055 | 6 |
| 309 | 80.800003 | 9.248248 | 1.034843 | 24 | 61 | -3800000 | 411500000 | 4.130000 | 99636803.870000 | 19.564165 | 9.471400 | 5 |
| 310 | 70.709999 | 17.341522 | 1.626934 | 19 | 56 | 425000000 | 3990000000 | 8.000000 | 498750000.000000 | 8.838750 | 15.262155 | 3 |
| 311 | 94.970001 | 6.028805 | 1.845710 | 5 | 81 | 142787000 | 221177000 | 1.660000 | 133239156.600000 | 57.210844 | -0.266191 | 8 |
| 312 | 99.959999 | 10.027519 | 1.019724 | 11 | 47 | 637230000 | 760434000 | 3.610000 | 210646537.400000 | 27.689750 | -1.081912 | 9 |
| 313 | 76.879997 | -1.448539 | 1.454019 | 37 | 10 | 98989000 | 507577000 | 3.070000 | 165334527.700000 | 25.042344 | -13.617440 | 6 |
| 314 | 87.360001 | 23.459580 | 1.379480 | 35 | 127 | 37051000 | 375236000 | 3.290000 | 114053495.400000 | 26.553192 | 4.076543 | 7 |
| 315 | 125.830002 | 21.928300 | 2.456535 | 59 | 221 | 89509000 | -556334000 | -2.310000 | 240837229.400000 | 39.602928 | 2.559671 | 5 |
| 316 | 56.430000 | 0.213104 | 1.444924 | 4 | 47 | -1803000 | 419222000 | 1.260000 | 332715873.000000 | 44.785714 | -4.040751 | 9 |
| 317 | 46.220001 | 6.277303 | 0.842592 | 109 | 14 | -6128000000 | 17879000000 | 4.380000 | 4081963470.000000 | 10.552512 | -26.380687 | 10 |
| 318 | 134.580002 | 13.925341 | 1.044615 | 23 | 425 | 65488000 | 469053000 | 5.700000 | 82290000.000000 | 23.610527 | 29.547150 | 5 |
| 319 | 51.310001 | -1.986623 | 1.103033 | 7 | 2 | -12100000 | 640300000 | 2.360000 | 271313559.300000 | 21.741526 | -1.850995 | 11 |
| 320 | 54.360001 | 5.532912 | 0.969774 | 12 | 99 | -460000000 | 22894000000 | 4.180000 | 5477033493.000000 | 13.004785 | -0.938007 | 4 |
| 321 | 146.869995 | -0.230971 | 2.397803 | 17 | 10 | -254000000 | 783000000 | 9.950000 | 78693467.340000 | 14.760804 | -45.086335 | 1 |
| 322 | 53.369999 | 7.061186 | 0.940366 | 14 | 2 | -1268000000 | 753000000 | 1.660000 | 453614457.800000 | 32.150602 | -1.415299 | 6 |
| 323 | 25.700001 | -30.988186 | 3.719560 | 9 | 4 | -140000000 | -571000000 | -0.760000 | 751315789.500000 | 93.089287 | -14.561121 | 3 |
| 324 | 17.910000 | -2.610109 | 1.273051 | 60 | 16 | -467300000 | 837800000 | 1.630000 | 513987730.100000 | 10.987730 | -8.043772 | 7 |
| 325 | 29.980000 | 8.544529 | 1.338067 | 10 | 116 | -568000000 | 506000000 | 0.890000 | 568539325.800000 | 33.685393 | 2.284802 | 9 |
| 326 | 72.650002 | 1.000976 | 1.331918 | 64 | 9 | -12000000 | 612000000 | 5.180000 | 118146718.100000 | 14.025097 | -10.249967 | 1 |
| 327 | 69.190002 | 29.496541 | 3.794783 | 174 | 198 | -102075000 | 195290000 | 1.930000 | 101186528.500000 | 35.849742 | 12.695712 | 1 |
| 328 | 89.379997 | -14.403372 | 2.397940 | 86 | 190 | 373520000 | -2408948000 | -25.920000 | 92937808.640000 | 93.089287 | 7.186128 | 3 |
| 329 | 35.910000 | 1.383405 | 1.015052 | 9 | 2 | 5332000 | 984485000 | 1.940000 | 507466494.800000 | 18.510309 | -2.261927 | 11 |
| 330 | 39.180000 | 7.696534 | 0.991011 | 10 | 99 | 734422000 | 1201560000 | 4.220000 | 284729857.800000 | 9.284360 | -7.762677 | 4 |
| 331 | 77.949997 | 3.656915 | 1.370062 | 9 | 7 | -911000000 | 16150000000 | 3.850000 | 4194805195.000000 | 20.246752 | -2.706443 | 3 |
| 332 | 60.849998 | 19.901474 | 1.007230 | 11 | 60 | 133000000 | 251200000 | 1.790000 | 140335195.500000 | 33.994412 | 0.855096 | 5 |
| 333 | 10.630000 | 9.474768 | 1.866680 | 5 | 26 | -43000000 | 474000000 | 0.420000 | 1128571429.000000 | 25.309524 | -0.295949 | 7 |
| 334 | 36.500000 | 11.009729 | 1.166311 | 16 | 83 | 17000000 | 340000000 | 1.880000 | 180851063.800000 | 19.414894 | 4.130471 | 6 |
| 335 | 33.259998 | 14.887727 | 1.845149 | 15 | 459 | -1032187000 | -4359082000 | -4.640000 | 939457327.600000 | 28.976191 | 6.261775 | 7 |
| 336 | 52.516175 | -8.698917 | 1.478877 | 142 | 27 | 159000000 | 1293000000 | 2.970000 | 435353535.400000 | 17.682214 | -3.838260 | 1 |
| 337 | 102.589996 | 9.347683 | 1.404206 | 1 | 100 | 376000000 | 147000000 | 0.780000 | 188461538.500000 | 131.525636 | -23.884449 | 5 |
| 338 | 27.299999 | -1.158588 | 1.468176 | 4 | 99 | -43623000 | 309471000 | 1.200000 | 257892500.000000 | 22.749999 | -0.063096 | 4 |
| 339 | 47.919998 | 16.678836 | 1.610285 | 32 | 65 | 272000000 | 339000000 | 0.680000 | 498529411.800000 | 70.470585 | 1.723068 | 5 |
# Plot correlation heatmap
plt.figure(figsize=(15, 7))
sns.heatmap(df_eda_corr.corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral")
plt.show()
Let's look at all the numerical variables with regard to sector.
To do this, we'll want to first normalize the scale of the various numerical series so we can get an accurate picture of how they vary by sector.
df_eda.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 340 entries, 0 to 339 Data columns (total 14 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 GICS Sector 340 non-null object 1 GICS Sub Industry 340 non-null object 2 Current Price 340 non-null float64 3 Price Change 340 non-null float64 4 Volatility 340 non-null float64 5 ROE 340 non-null int64 6 Cash Ratio 340 non-null int64 7 Net Cash Flow 340 non-null int64 8 Net Income 340 non-null int64 9 Earnings Per Share 340 non-null float64 10 Estimated Shares Outstanding 340 non-null float64 11 P/E Ratio 340 non-null float64 12 P/B Ratio 340 non-null float64 13 GICS Sector Int 340 non-null int64 dtypes: float64(7), int64(5), object(2) memory usage: 37.3+ KB
# Normalize the data
categorical_df_eda = df_eda[['GICS Sector', 'GICS Sector Int']] # we want to preserve the integer without scaling
numerical_df_eda = df_eda.drop(columns=['GICS Sector', 'GICS Sub Industry', 'GICS Sector Int'])
# Scale it using min-max method (vs. Z-score), since we want to preserve the relationships between data points
min_max_scaler = MinMaxScaler()
normalized_numerical_data = min_max_scaler.fit_transform(numerical_df_eda)
# Create dataframe for numerical correlation heatmap
normalized_df_eda = pd.DataFrame(normalized_numerical_data, columns=numerical_df_eda.columns)
# Combine back with sector for pairplot that recombines friendly Sector names
final_normalized_df_eda = pd.concat([categorical_df_eda, normalized_df_eda], axis=1)
final_normalized_df_eda.head(5)
| GICS Sector | GICS Sector Int | Current Price | Price Change | Volatility | ROE | Cash Ratio | Net Cash Flow | Net Income | Earnings Per Share | Estimated Shares Outstanding | P/E Ratio | P/B Ratio | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Industrials | 6 | 0.029793 | 0.559101 | 0.247990 | 0.146288 | 0.053236 | 0.331665 | 0.649114 | 0.652260 | 0.104452 | 0.001491 | 0.328169 |
| 1 | Health Care | 5 | 0.043087 | 0.542850 | 0.380757 | 0.140830 | 0.080376 | 0.352152 | 0.597707 | 0.578219 | 0.261814 | 0.030224 | 0.328335 |
| 2 | Health Care | 5 | 0.031808 | 0.571834 | 0.140499 | 0.021834 | 0.069937 | 0.379895 | 0.582677 | 0.576332 | 0.240842 | 0.023500 | 0.369059 |
| 3 | Information Technology | 7 | 0.070400 | 0.598024 | 0.162344 | 0.008734 | 0.187891 | 0.343024 | 0.503597 | 0.561236 | 0.076973 | 0.136392 | 0.391448 |
| 4 | Information Technology | 7 | 0.040002 | 0.443347 | 0.251634 | 0.014192 | 0.283925 | 0.360413 | 0.505001 | 0.552700 | 0.362110 | 0.334251 | 0.376145 |
#Re-plot the correlation heatmap using scaled data
# Convert scaled Numpy array results to dataframe
normalized_numerical_data_df = pd.DataFrame(normalized_numerical_data, columns=numerical_df_eda.columns)
# Plot new results
plt.figure(figsize=(15, 7))
sns.heatmap(normalized_numerical_data_df.corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral")
plt.show()
OBSERVATIONS
Our revised correlation matrix using scaled data modified the correlations considerably:
Positive correlations:
Negative correlations:
# Create pairplot, noting the sector values
sns.pairplot(df_eda_corr, diag_kind='kde')
<seaborn.axisgrid.PairGrid at 0x795d44836860>
OBSERVATIONS
The pairplot above clearly shows us that sector is an important feature with data clustered into multiple peaks, even though this isn't apparent in our correlation heatmap.
Price Change and Net Income are fairly normally distributed
P/E Ratio appears to be bi-modal.
GICS Sector (converted to an integer) appears to be tri-modal.
As noted earlier, many of the remaining features exhibit skewness (primary right-skewed).
Since we have a major categorical variable (Sector), let's next compare each numerical series to it.
# Show sector vs. other numeric series using normalized dataframe
for feature in list_numcol:
plt.figure(figsize=(12, 5))
sns.boxplot(data=final_normalized_df_eda, x='GICS Sector', y=feature, hue='GICS Sector')
plt.xticks(rotation=90)
plt.show()
OBSERVATIONS
For Sector vs. other variables, we can note a few apparent impacts. Energy in particular seems to have the widest range of values for numerous series.
Current Price and ROE seem to vary widely by sector.
Other series vary moderately by sector. Energy and Telecommunications in particular exhibit wide ranges.
# Create pivot table for price by sector (counts and averages)
df_eda_sector_pivot = df_eda.pivot_table(
index='GICS Sector',
values='Current Price',
aggfunc=['count', 'mean'] # Calculate both count and mean
)
# Rename the columns for clarity
df_eda_sector_pivot.columns = ['Count', 'Average Current Price']
df_eda_sector_pivot = df_eda_sector_pivot.reset_index()
# Calculate percentages relative to the total count, excluding the total row
total_counts = df_eda_sector_pivot['Count'].sum()
df_eda_sector_pivot['Percentage of Total'] = (df_eda_sector_pivot['Count'] / total_counts) * 100
# Sort the DataFrame by counts in descending order
df_eda_sector_pivot.sort_values(by='Average Current Price', ascending=False, inplace=True)
# Display the combined DataFrame
display(df_eda_sector_pivot)
| GICS Sector | Count | Average Current Price | Percentage of Total | |
|---|---|---|---|---|
| 4 | Health Care | 40 | 132.048250 | 11.764706 |
| 0 | Consumer Discretionary | 40 | 128.095404 | 11.764706 |
| 8 | Real Estate | 27 | 90.976925 | 7.941176 |
| 7 | Materials | 20 | 76.551501 | 5.882353 |
| 5 | Industrials | 53 | 74.411807 | 15.588235 |
| 1 | Consumer Staples | 19 | 71.972807 | 5.588235 |
| 6 | Information Technology | 33 | 63.548485 | 9.705882 |
| 3 | Financials | 49 | 58.659183 | 14.411765 |
| 10 | Utilities | 24 | 52.968542 | 7.058824 |
| 2 | Energy | 30 | 46.042334 | 8.823529 |
| 9 | Telecommunications Services | 5 | 32.964000 | 1.470588 |
Answer
As a whole, the range of Current Price is very wide, and it has a distinctly right-skewed distribution, with a maximum price of 1274.95 and minimum price of 4.50. Half of the stock prices fall below 80.86.
Looking by sector, the highest average prices belong to the Health Care and Consumer Discretionary sectors, both of which comprise about 11.8% each of our dataset and have prices averaging around 130.
The lowest current prices belong to Telecommunications (1.5% of our dataset, at about 32.96) and Energy (8.8%, at 45.04).
Consumer discretionary has the highest outliers of any sector.
Earlier, we found that the average price change was just over $4.
To answer this question, let's further compare the price changes by sector.
# Show prior dataframe
df_stats[df_stats['Feature'] == 'Price Change']
| Feature | Mean | Median | Min | Max | Range | Standard Deviation | Variance | P-value | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Price Change | 4.078194 | 4.819505 | -47.129693 | 55.051683 | 102.181377 | 12.006338 | 144.152149 | 0.000000 |
# Create pivot table for price change by sector
df_eda_sector_price_change_pivot = df_eda.pivot_table(
index='GICS Sector',
values='Price Change',
aggfunc=['mean', 'min', 'max']
)
df_eda_sector_price_change_pivot['Range'] = df_eda_sector_price_change_pivot['max'] - df_eda_sector_price_change_pivot['min']
df_eda_sector_price_change_pivot.columns = ['Mean Price Change', 'Min', 'Max', 'Range']
df_eda_sector_price_change_pivot.sort_values(by='Mean Price Change', ascending=False, inplace=True)
df_eda_sector_price_change_pivot = df_eda_sector_price_change_pivot.reset_index()
df_eda_sector_price_change_pivot
| GICS Sector | Mean Price Change | Min | Max | Range | |
|---|---|---|---|---|---|
| 0 | Health Care | 9.585652 | -12.532337 | 33.177346 | 45.709684 |
| 1 | Consumer Staples | 8.684750 | -12.017268 | 24.496225 | 36.513493 |
| 2 | Information Technology | 7.217476 | -23.790903 | 55.051683 | 78.842586 |
| 3 | Telecommunications Services | 6.956980 | -2.301255 | 24.707504 | 27.008759 |
| 4 | Real Estate | 6.205548 | -13.067267 | 15.573900 | 28.641168 |
| 5 | Consumer Discretionary | 5.846093 | -33.131268 | 34.803917 | 67.935185 |
| 6 | Materials | 5.589738 | -31.685167 | 37.489677 | 69.174843 |
| 7 | Financials | 3.865406 | -14.292764 | 15.462833 | 29.755597 |
| 8 | Industrials | 2.833127 | -23.244191 | 20.432767 | 43.676958 |
| 9 | Utilities | 0.803657 | -8.230553 | 8.596874 | 16.827427 |
| 10 | Energy | -10.228289 | -47.129693 | 17.341522 | 64.471216 |
# Visualize price change
plt.figure(figsize=(10, 5))
sns.barplot(data=df_eda.sort_values(by='Price Change', ascending=False), x='GICS Sector', y='Price Change', hue='GICS Sector')
plt.xticks(rotation=90)
plt.show()
# Visualize volatility
plt.figure(figsize=(10, 5))
sns.barplot(data=df_eda.sort_values(by='Volatility', ascending=False), x='GICS Sector', y='Volatility', hue='GICS Sector')
plt.xticks(rotation=90)
plt.show()
OBSERVATIONS
Health Care has the highest average price change out of all sectors. The top three sectors with the highest average price change are:
All sectors except the four listed below exceed the average price change of \$4.09:
Energy is the only sector with an average negative price change, and also exhibits the second-highest range of values. It also has the most volatility in price.
Next, let's drill down by sector into scaled correlation data to see if the relationships differ widely by sector.
rank_mapping
{'Consumer Discretionary': 1,
'Consumer Staples': 2,
'Energy': 3,
'Financials': 4,
'Health Care': 5,
'Industrials': 6,
'Information Technology': 7,
'Materials': 8,
'Real Estate': 9,
'Telecommunications Services': 10,
'Utilities': 11}
final_normalized_df_eda
| GICS Sector | GICS Sector Int | Current Price | Price Change | Volatility | ROE | Cash Ratio | Net Cash Flow | Net Income | Earnings Per Share | Estimated Shares Outstanding | P/E Ratio | P/B Ratio | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Industrials | 6 | 0.029793 | 0.559101 | 0.247990 | 0.146288 | 0.053236 | 0.331665 | 0.649114 | 0.652260 | 0.104452 | 0.001491 | 0.328169 |
| 1 | Health Care | 5 | 0.043087 | 0.542850 | 0.380757 | 0.140830 | 0.080376 | 0.352152 | 0.597707 | 0.578219 | 0.261814 | 0.030224 | 0.328335 |
| 2 | Health Care | 5 | 0.031808 | 0.571834 | 0.140499 | 0.021834 | 0.069937 | 0.379895 | 0.582677 | 0.576332 | 0.240842 | 0.023500 | 0.369059 |
| 3 | Information Technology | 7 | 0.070400 | 0.598024 | 0.162344 | 0.008734 | 0.187891 | 0.343024 | 0.503597 | 0.561236 | 0.076973 | 0.136392 | 0.391448 |
| 4 | Information Technology | 7 | 0.040002 | 0.443347 | 0.251634 | 0.014192 | 0.283925 | 0.360413 | 0.505001 | 0.552700 | 0.362110 | 0.334251 | 0.376145 |
| 5 | Consumer Staples | 2 | 0.025330 | 0.343628 | 0.203627 | 0.009825 | 0.051148 | 0.344645 | 0.529018 | 0.576781 | 0.096340 | 0.017772 | 0.407517 |
| 6 | Information Technology | 7 | 0.214152 | 0.521807 | 0.099773 | 0.031659 | 0.026096 | 0.353399 | 0.502909 | 0.629976 | 0.006406 | 0.053523 | 1.000000 |
| 7 | Utilities | 11 | 0.030485 | 0.482516 | 0.101647 | 0.008734 | 0.014614 | 0.359533 | 0.503731 | 0.573277 | 0.035381 | 0.026074 | 0.367474 |
| 8 | Utilities | 11 | 0.042324 | 0.484447 | 0.087167 | 0.010917 | 0.009395 | 0.350991 | 0.533256 | 0.578039 | 0.064294 | 0.029558 | 0.356249 |
| 9 | Financials | 4 | 0.043607 | 0.490861 | 0.081919 | 0.014192 | 0.103340 | 0.340923 | 0.543277 | 0.602750 | 0.065743 | 0.013810 | 0.361799 |
| 10 | Financials | 4 | 0.045236 | 0.543032 | 0.097170 | 0.001092 | 0.103340 | 0.346522 | 0.536252 | 0.565100 | 0.207406 | 0.064241 | 0.349891 |
| 11 | Real Estate | 9 | 0.027966 | 0.535404 | 0.111823 | 0.015284 | 0.049061 | 0.351239 | 0.495658 | 0.563573 | 0.022172 | 0.044563 | 0.364794 |
| 12 | Financials | 4 | 0.059853 | 0.479808 | 0.098636 | 0.002183 | 0.103340 | 0.349607 | 0.493424 | 0.568605 | 0.006586 | 0.068150 | 0.351132 |
| 13 | Financials | 4 | 0.028683 | 0.455296 | 0.082895 | 0.009825 | 0.103340 | 0.355749 | 0.497911 | 0.568515 | 0.023598 | 0.032074 | 0.322947 |
| 14 | Information Technology | 7 | 0.037884 | 0.228406 | 0.169316 | 0.009825 | 0.234864 | 0.352146 | 0.497173 | 0.566089 | 0.024608 | 0.050092 | 0.391851 |
| 15 | Materials | 8 | 0.040545 | 0.720206 | 0.322641 | 0.009825 | 0.013570 | 0.279368 | 0.497455 | 0.576961 | 0.013633 | 0.029847 | 0.304456 |
| 16 | Industrials | 6 | 0.059829 | 0.481459 | 0.270419 | 0.037118 | 0.077244 | 0.349493 | 0.508151 | 0.609309 | 0.016410 | 0.017605 | 0.365548 |
| 17 | Financials | 4 | 0.045330 | 0.525751 | 0.083211 | 0.010917 | 0.103340 | 0.345490 | 0.535731 | 0.595921 | 0.064641 | 0.017504 | 0.349891 |
| 18 | Industrials | 6 | 0.048345 | 0.595832 | 0.143137 | 0.655022 | 0.046973 | 0.347717 | 0.493681 | 0.564291 | 0.011174 | 0.072870 | 0.366704 |
| 19 | Health Care | 5 | 0.146602 | 0.679851 | 0.335274 | 0.001092 | 0.203549 | 0.352621 | 0.493475 | 0.556025 | 0.030023 | 0.528618 | 0.301913 |
| 20 | Information Technology | 7 | 0.011154 | 0.723849 | 0.188950 | 0.018559 | 0.136743 | 0.406700 | 0.519179 | 0.560068 | 0.194225 | 0.025874 | 0.389782 |
| 21 | Industrials | 6 | 0.038640 | 0.482888 | 0.092569 | 0.018559 | 0.038622 | 0.350663 | 0.502790 | 0.572019 | 0.034659 | 0.035896 | 0.349096 |
| 22 | Financials | 4 | 0.122209 | 0.396534 | 0.353508 | 0.018559 | 0.069937 | 0.350970 | 0.501230 | 0.635187 | 0.004355 | 0.026469 | 0.219821 |
| 23 | Health Care | 5 | 0.124232 | 0.629206 | 0.233201 | 0.026201 | 0.377871 | 0.363474 | 0.635126 | 0.632132 | 0.119167 | 0.028195 | 0.488009 |
| 24 | Financials | 4 | 0.080224 | 0.437546 | 0.127141 | 0.022926 | 0.069937 | 0.341768 | 0.523035 | 0.627190 | 0.025108 | 0.017975 | 0.305681 |
| 25 | Real Estate | 9 | 0.072769 | 0.561382 | 0.112465 | 0.009825 | 0.040710 | 0.350782 | 0.504755 | 0.562674 | 0.074169 | 0.124431 | 0.269541 |
| 26 | Consumer Discretionary | 1 | 0.528466 | 0.777028 | 0.189042 | 0.003275 | 0.060543 | 0.392249 | 0.502898 | 0.561416 | 0.071425 | 1.000000 | 0.390009 |
| 27 | Consumer Discretionary | 1 | 0.043418 | 0.484237 | 0.194379 | 0.019651 | 0.001044 | 0.350516 | 0.499700 | 0.585228 | 0.013854 | 0.023320 | 0.332136 |
| 28 | Health Care | 5 | 0.106214 | 0.455167 | 0.202370 | 0.010917 | 0.073069 | 0.349362 | 0.543840 | 0.637344 | 0.038396 | 0.021701 | 0.219863 |
| 29 | Financials | 4 | 0.069039 | 0.499504 | 0.096668 | 0.024017 | 0.103340 | 0.350870 | 0.519345 | 0.594213 | 0.041304 | 0.030029 | 0.333161 |
| 30 | Energy | 3 | 0.031461 | 0.572780 | 0.434702 | 1.000000 | 0.083507 | 0.372388 | 0.000000 | 0.000000 | 0.058186 | 0.171688 | 0.395206 |
| 31 | Energy | 3 | 0.034696 | 0.257656 | 0.442437 | 0.055677 | 0.022965 | 0.149443 | 0.350969 | 0.431485 | 0.078294 | 0.171688 | 0.308300 |
| 32 | Information Technology | 7 | 0.037569 | 0.487597 | 0.071382 | 0.025109 | 0.182672 | 0.374587 | 0.506389 | 0.572109 | 0.045899 | 0.034679 | 0.410959 |
| 33 | Industrials | 6 | 0.002282 | 0.477362 | 0.483223 | 0.002183 | 0.038622 | 0.351870 | 0.483761 | 0.547129 | 0.164889 | 0.029998 | 0.383846 |
| 34 | Information Technology | 7 | 0.026927 | 0.689453 | 0.299768 | 0.010917 | 0.073069 | 0.255943 | 0.509068 | 0.560787 | 0.115715 | 0.055334 | 0.372395 |
| 35 | Real Estate | 9 | 0.141391 | 0.508775 | 0.103905 | 0.007642 | 0.049061 | 0.347149 | 0.505936 | 0.599694 | 0.017322 | 0.057705 | 0.355923 |
| 36 | Information Technology | 7 | 0.110709 | 0.636441 | 0.289590 | 0.030568 | 0.170146 | 0.357375 | 0.518908 | 0.575613 | 0.165761 | 0.029164 | 0.390255 |
| 37 | Utilities | 11 | 0.043489 | 0.545369 | 0.113953 | 0.008734 | 0.004175 | 0.351245 | 0.500396 | 0.573816 | 0.024671 | 0.037187 | 0.347122 |
| 38 | Financials | 4 | 0.051202 | 0.400400 | 0.043387 | 0.026201 | 0.103340 | 0.365382 | 0.598103 | 0.584958 | 0.078118 | 0.013955 | 0.368012 |
| 39 | Industrials | 6 | 0.110268 | 0.560129 | 0.109892 | 0.088428 | 0.025052 | 0.337076 | 0.598374 | 0.617486 | 0.107741 | 0.031026 | 0.478360 |
| 40 | Financials | 4 | 0.009705 | 0.543841 | 0.178203 | 0.005459 | 0.103340 | 1.000000 | 0.821680 | 0.587474 | 0.133309 | 0.019176 | 0.366409 |
| 41 | Health Care | 5 | 0.026487 | 0.624694 | 0.122531 | 0.010917 | 0.133612 | 0.328287 | 0.510652 | 0.565909 | 0.084178 | 0.035226 | 0.413074 |
| 42 | Financials | 4 | 0.026219 | 0.519368 | 0.089557 | 0.007642 | 0.103340 | 0.393907 | 0.533917 | 0.573187 | 0.126714 | 0.022211 | 0.366825 |
| 43 | Health Care | 5 | 0.145570 | 0.476397 | 0.171898 | 0.008734 | 0.085595 | 0.350256 | 0.493296 | 0.566089 | 0.007755 | 0.194836 | 0.350863 |
| 44 | Energy | 3 | 0.032784 | 0.340740 | 0.474772 | 0.012009 | 0.087683 | 0.368823 | 0.449468 | 0.509570 | 0.066934 | 0.171688 | 0.436729 |
| 45 | Health Care | 5 | 0.237593 | 0.509366 | 0.284082 | 0.040393 | 0.138831 | 0.355214 | 0.564415 | 0.688112 | 0.033099 | 0.032343 | 0.371773 |
| 46 | Financials | 4 | 0.028903 | 0.514298 | 0.121786 | 0.007642 | 0.103340 | 0.337014 | 0.556306 | 0.574445 | 0.184145 | 0.023164 | 0.354793 |
| 47 | Materials | 8 | 0.053705 | 0.623064 | 0.169883 | 0.022926 | 0.010438 | 0.351576 | 0.496329 | 0.568335 | 0.017834 | 0.061974 | 0.351994 |
| 48 | Health Care | 5 | 0.050604 | 0.618619 | 0.199047 | 0.010917 | 0.055324 | 0.250907 | 0.523098 | 0.558361 | 0.267013 | 0.133774 | 0.373846 |
| 49 | Health Care | 5 | 0.010972 | 0.576301 | 0.197199 | 0.003275 | 0.013570 | 0.342174 | 0.485491 | 0.548297 | 0.212033 | 0.054339 | 0.352066 |
| 50 | Consumer Discretionary | 1 | 0.030485 | 0.495200 | 0.344592 | 0.017467 | 0.026096 | 0.343673 | 0.503183 | 0.574355 | 0.032044 | 0.024677 | 0.376072 |
| 51 | Real Estate | 9 | 0.096848 | 0.531733 | 0.092622 | 0.009825 | 0.049061 | 0.318048 | 0.502629 | 0.583970 | 0.020579 | 0.058496 | 0.364794 |
| 52 | Financials | 4 | 0.037192 | 0.507376 | 0.137468 | 0.007642 | 0.103340 | 0.000000 | 0.849906 | 0.598526 | 0.515262 | 0.012626 | 0.365290 |
| 53 | Industrials | 6 | 0.049951 | 0.495980 | 0.197664 | 0.017467 | 0.026096 | 0.323001 | 0.542839 | 0.581723 | 0.111216 | 0.030970 | 0.401509 |
| 54 | Financials | 4 | 0.088433 | 0.590357 | 0.055027 | 0.009825 | 0.103340 | 0.385587 | 0.549552 | 0.628179 | 0.048552 | 0.019958 | 0.285856 |
| 55 | Real Estate | 9 | 0.023677 | 0.541463 | 0.146793 | 0.020742 | 0.012526 | 0.344286 | 0.501879 | 0.564651 | 0.049896 | 0.034564 | 0.354335 |
| 56 | Real Estate | 9 | 0.064505 | 0.554884 | 0.059016 | 0.021834 | 0.037578 | 0.350657 | 0.522180 | 0.589810 | 0.051356 | 0.031490 | 0.318994 |
| 57 | Consumer Discretionary | 1 | 0.039340 | 0.548710 | 0.159630 | 0.006550 | 0.020877 | 0.383836 | 0.527100 | 0.570222 | 0.122278 | 0.040317 | 0.334539 |
| 58 | Health Care | 5 | 0.090724 | 0.543920 | 0.329531 | 0.028384 | 0.347599 | 0.374287 | 0.523869 | 0.568065 | 0.124828 | 0.107315 | 0.349926 |
| 59 | Materials | 8 | 0.028580 | 0.370704 | 0.425026 | 0.016376 | 0.026096 | 0.297054 | 0.504334 | 0.576602 | 0.031998 | 0.020577 | 0.369062 |
| 60 | Financials | 4 | 0.017073 | 0.561779 | 0.118557 | 0.003275 | 0.103340 | 0.344583 | 0.507984 | 0.563842 | 0.083871 | 0.026588 | 0.370673 |
| 61 | Consumer Staples | 2 | 0.029863 | 0.471488 | 0.050915 | 0.020742 | 0.039666 | 0.347648 | 0.499029 | 0.578039 | 0.016871 | 0.020232 | 0.325030 |
| 62 | Energy | 3 | 0.000000 | 0.088352 | 0.994742 | 0.748908 | 0.022965 | 0.247873 | 0.184344 | 0.348369 | 0.102262 | 0.048509 | 0.362010 |
| 63 | Industrials | 6 | 0.045275 | 0.373077 | 0.117578 | 0.046943 | 0.012526 | 0.351786 | 0.501099 | 0.581544 | 0.019102 | 0.027964 | 0.376428 |
| 64 | Consumer Discretionary | 1 | 0.140580 | 0.496453 | 0.250795 | 0.641921 | 0.000000 | 0.350619 | 0.484824 | 0.528080 | 0.013675 | 0.034059 | 0.000000 |
| 65 | Health Care | 5 | 0.111638 | 0.546206 | 0.222319 | 0.017467 | 0.073069 | 0.367697 | 0.534125 | 0.623326 | 0.037287 | 0.028519 | 0.328066 |
| 66 | Financials | 4 | 0.043032 | 0.556922 | 0.052679 | 0.009825 | 0.103340 | 0.349087 | 0.503690 | 0.584689 | 0.022205 | 0.023527 | 0.349891 |
| 67 | Consumer Staples | 2 | 0.048896 | 0.508029 | 0.042192 | 0.504367 | 0.028184 | 0.346835 | 0.519325 | 0.563663 | 0.143013 | 0.077331 | 0.368308 |
| 68 | Financials | 4 | 0.029383 | 0.479831 | 0.214327 | 0.006550 | 0.103340 | 0.352934 | 0.501334 | 0.576242 | 0.024487 | 0.021598 | 0.370673 |
| 69 | Financials | 4 | 0.067771 | 0.437726 | 0.153419 | 0.005459 | 0.103340 | 0.360769 | 0.516469 | 0.583251 | 0.050304 | 0.040916 | 0.085141 |
| 70 | Consumer Discretionary | 1 | 0.374159 | 0.136996 | 0.452533 | 0.022926 | 0.247390 | 0.345194 | 0.500388 | 0.687393 | 0.000557 | 0.054137 | 0.454814 |
| 71 | Industrials | 6 | 0.065733 | 0.276419 | 0.192156 | 0.019651 | 0.050104 | 0.332103 | 0.519637 | 0.620541 | 0.024515 | 0.015734 | 0.475469 |
| 72 | Utilities | 11 | 0.024857 | 0.480317 | 0.079202 | 0.014192 | 0.012526 | 0.351777 | 0.501668 | 0.566987 | 0.041581 | 0.030573 | 0.371291 |
| 73 | Health Care | 5 | 0.048258 | 0.673726 | 0.406962 | 0.016376 | 0.073069 | 0.355248 | 0.497874 | 0.576781 | 0.014850 | 0.036325 | 0.364618 |
| 74 | Utilities | 11 | 0.010910 | 0.475294 | 0.170711 | 0.020742 | 0.044885 | 0.349493 | 0.476048 | 0.535448 | 0.065585 | 0.027381 | 0.373497 |
| 75 | Financials | 4 | 0.053272 | 0.455172 | 0.164141 | 0.008734 | 0.103340 | 0.374984 | 0.574901 | 0.614161 | 0.087866 | 0.013635 | 0.367444 |
| 76 | Energy | 3 | 0.010382 | 0.264533 | 0.603776 | 0.005459 | 0.000000 | 0.349917 | 0.488099 | 0.547399 | 0.061824 | 0.171688 | 0.369893 |
| 77 | Health Care | 5 | 0.102090 | 0.366578 | 0.213652 | 0.007642 | 0.003132 | 0.350282 | 0.494716 | 0.565999 | 0.018374 | 0.059148 | 0.375148 |
| 78 | Industrials | 6 | 0.016884 | 0.418670 | 0.232151 | 0.017467 | 0.077244 | 0.349274 | 0.531499 | 0.567886 | 0.155967 | 0.019119 | 0.375378 |
| 79 | Telecommunications Services | 10 | 0.016262 | 0.462794 | 0.205109 | 0.005459 | 0.003132 | 0.350494 | 0.508776 | 0.564112 | 0.086115 | 0.024735 | 0.305755 |
| 80 | Information Technology | 7 | 0.043701 | 0.415684 | 0.157260 | 0.017467 | 0.189979 | 0.354157 | 0.524319 | 0.573906 | 0.094660 | 0.037219 | 0.405689 |
| 81 | Information Technology | 7 | 0.056004 | 0.549525 | 0.321222 | 0.016376 | 0.054280 | 0.353946 | 0.497131 | 0.567976 | 0.021400 | 0.066085 | 0.362378 |
| 82 | Consumer Staples | 2 | 0.073415 | 0.474218 | 0.196056 | 0.014192 | 0.011482 | 0.349869 | 0.599646 | 0.591787 | 0.178770 | 0.034365 | 0.336768 |
| 83 | Energy | 3 | 0.067268 | 0.586943 | 0.264498 | 0.002183 | 0.044885 | 0.295415 | 0.586095 | 0.572019 | 0.299588 | 0.064051 | 0.394198 |
| 84 | Energy | 3 | 0.069550 | 0.434381 | 0.509344 | 0.000000 | 0.039666 | 0.357705 | 0.491847 | 0.554767 | 0.015390 | 0.321894 | 0.398633 |
| 85 | Utilities | 11 | 0.049699 | 0.422201 | 0.040752 | 0.015284 | 0.008351 | 0.359596 | 0.530060 | 0.578758 | 0.091969 | 0.034538 | 0.333916 |
| 86 | Industrials | 6 | 0.036357 | 0.592131 | 0.184840 | 0.044760 | 0.032359 | 0.346929 | 0.584824 | 0.600952 | 0.125441 | 0.011405 | 0.289443 |
| 87 | Materials | 8 | 0.048880 | 0.828129 | 0.219585 | 0.020742 | 0.062630 | 0.300200 | 0.531186 | 0.569413 | 0.142267 | 0.052858 | 0.386831 |
| 88 | Industrials | 6 | 0.056492 | 0.499917 | 0.212843 | 0.030568 | 0.022965 | 0.362292 | 0.530915 | 0.586126 | 0.082289 | 0.022675 | 0.401574 |
| 89 | Financials | 4 | 0.038663 | 0.496992 | 0.110930 | 0.020742 | 0.103340 | 0.422119 | 0.538357 | 0.596100 | 0.068369 | 0.014276 | 0.369148 |
| 90 | Health Care | 5 | 0.052454 | 0.614637 | 0.168533 | 0.015284 | 0.011482 | 0.348711 | 0.505253 | 0.594123 | 0.018989 | 0.021946 | 0.348794 |
| 91 | Industrials | 6 | 0.051885 | 0.548576 | 0.119136 | 0.014192 | 0.013570 | 0.281284 | 0.560463 | 0.593135 | 0.109324 | 0.022289 | 0.303922 |
| 92 | Consumer Discretionary | 1 | 0.079169 | 0.481290 | 0.118353 | 0.019651 | 0.027140 | 0.377080 | 0.665207 | 0.594393 | 0.271651 | 0.034837 | 0.351558 |
| 93 | Consumer Discretionary | 1 | 0.017458 | 0.481071 | 0.248532 | 0.019651 | 0.026096 | 0.351276 | 0.512028 | 0.528080 | 0.013675 | 0.034059 | 0.000000 |
| 94 | Consumer Discretionary | 1 | 0.016309 | 0.496202 | 0.280482 | 0.019651 | 0.026096 | 0.351276 | 0.512028 | 0.528080 | 0.013675 | 0.034059 | 0.000000 |
| 95 | Consumer Discretionary | 1 | 0.063938 | 0.579744 | 0.183973 | 0.068777 | 0.014614 | 0.340392 | 0.520700 | 0.595561 | 0.042038 | 0.026548 | 0.367753 |
| 96 | Real Estate | 9 | 0.055980 | 0.613650 | 0.087667 | 0.006550 | 0.040710 | 0.351252 | 0.496658 | 0.563932 | 0.026504 | 0.086724 | 0.334480 |
| 97 | Industrials | 6 | 0.078264 | 0.449605 | 0.157208 | 0.015284 | 0.039666 | 0.352005 | 0.493992 | 0.591967 | 0.001369 | 0.036701 | 0.313093 |
| 98 | Industrials | 6 | 0.044716 | 0.529541 | 0.201307 | 0.025109 | 0.027140 | 0.340567 | 0.508606 | 0.599515 | 0.021186 | 0.015562 | 0.359803 |
| 99 | Consumer Staples | 2 | 0.069818 | 0.637876 | 0.108564 | 0.037118 | 0.060543 | 0.371919 | 0.506400 | 0.585857 | 0.026637 | 0.038782 | 0.309000 |
| 100 | Utilities | 11 | 0.052651 | 0.453079 | 0.094509 | 0.006550 | 0.008351 | 0.313681 | 0.549177 | 0.586306 | 0.108884 | 0.027979 | 0.349405 |
| 101 | Health Care | 5 | 0.051328 | 0.425786 | 0.124381 | 0.005459 | 0.082463 | 0.367255 | 0.496096 | 0.561326 | 0.030125 | 0.098941 | 0.380545 |
| 102 | Energy | 3 | 0.021646 | 0.309759 | 0.569432 | 0.222707 | 0.073069 | 0.376517 | 0.189160 | 0.230479 | 0.061796 | 0.171688 | 0.379683 |
| 103 | Information Technology | 7 | 0.018088 | 0.580272 | 0.175763 | 0.027293 | 0.282881 | 0.209934 | 0.526433 | 0.562764 | 0.192220 | 0.031006 | 0.393407 |
| 104 | Materials | 8 | 0.086489 | 0.498265 | 0.089775 | 0.015284 | 0.002088 | 0.346904 | 0.511363 | 0.580286 | 0.043839 | 0.058855 | 0.298222 |
| 105 | Utilities | 11 | 0.047046 | 0.422341 | 0.087042 | 0.008734 | 0.020877 | 0.358345 | 0.515343 | 0.586486 | 0.043292 | 0.024482 | 0.356249 |
| 106 | Industrials | 6 | 0.084120 | 0.603444 | 0.090431 | 0.019651 | 0.015658 | 0.349462 | 0.499418 | 0.582352 | 0.014872 | 0.053161 | 0.331421 |
| 107 | Utilities | 11 | 0.043063 | 0.401195 | 0.050456 | 0.009825 | 0.003132 | 0.351464 | 0.513759 | 0.578039 | 0.053688 | 0.030435 | 0.339938 |
| 108 | Materials | 8 | 0.049597 | 0.496998 | 0.174517 | 0.022926 | 0.014614 | 0.353028 | 0.508151 | 0.601222 | 0.019708 | 0.016926 | 0.310991 |
| 109 | Energy | 3 | 0.052178 | 0.421320 | 0.314005 | 0.037118 | 0.040710 | 0.307747 | 0.396154 | 0.475425 | 0.084498 | 0.171688 | 0.377883 |
| 110 | Real Estate | 9 | 0.234484 | 0.559293 | 0.149451 | 0.006550 | 0.171190 | 0.401161 | 0.494388 | 0.579118 | 0.004910 | 0.171606 | 0.487250 |
| 111 | Real Estate | 9 | 0.060679 | 0.539896 | 0.083970 | 0.007642 | 0.049061 | 0.350625 | 0.508612 | 0.571210 | 0.055363 | 0.059970 | 0.364794 |
| 112 | Energy | 3 | 0.037491 | 0.253235 | 0.424167 | 0.001092 | 0.209812 | 0.366940 | 0.492249 | 0.554947 | 0.020291 | 0.171688 | 0.417611 |
| 113 | Utilities | 11 | 0.036656 | 0.468183 | 0.129889 | 0.007642 | 0.001044 | 0.350095 | 0.508786 | 0.574805 | 0.047209 | 0.029521 | 0.365279 |
| 114 | Real Estate | 9 | 0.184903 | 0.527442 | 0.100149 | 0.003275 | 0.049061 | 0.350684 | 0.495312 | 0.581364 | 0.006303 | 0.124675 | 0.355923 |
| 115 | Financials | 4 | 0.019788 | 0.585102 | 0.186875 | 0.004367 | 0.103340 | 0.364632 | 0.496060 | 0.558181 | 0.042996 | 0.055764 | 0.368069 |
| 116 | Industrials | 6 | 0.037420 | 0.472651 | 0.204911 | 0.013100 | 0.010438 | 0.334511 | 0.531728 | 0.588103 | 0.071429 | 0.017728 | 0.328874 |
| 117 | Utilities | 11 | 0.050266 | 0.509297 | 0.125878 | 0.001092 | 0.045929 | 0.348334 | 0.487206 | 0.541019 | 0.021307 | 0.029558 | 0.401070 |
| 118 | Health Care | 5 | 0.058625 | 0.574923 | 0.242617 | 0.020742 | 0.268267 | 0.352577 | 0.500790 | 0.570581 | 0.030579 | 0.059805 | 0.401927 |
| 119 | Utilities | 11 | 0.018316 | 0.398565 | 0.160762 | 0.008734 | 0.077244 | 0.495183 | 0.537774 | 0.574041 | 0.044395 | 0.027381 | 0.362618 |
| 120 | Industrials | 6 | 0.031957 | 0.417694 | 0.085625 | 0.028384 | 0.098121 | 0.346825 | 0.500005 | 0.571660 | 0.026300 | 0.029901 | 0.400181 |
| 121 | Consumer Discretionary | 1 | 0.094297 | 0.509136 | 0.219810 | 0.016376 | 0.030271 | 0.359114 | 0.506410 | 0.602660 | 0.016726 | 0.034736 | 0.156034 |
| 122 | Real Estate | 9 | 0.065890 | 0.597486 | 0.117731 | 0.019651 | 0.040710 | 0.351437 | 0.498706 | 0.564112 | 0.036254 | 0.100730 | 0.302011 |
| 123 | Consumer Discretionary | 1 | 0.007549 | 0.484706 | 0.108735 | 0.027293 | 0.044885 | 0.460497 | 0.644173 | 0.566628 | 0.641968 | 0.008836 | 0.395879 |
| 124 | Industrials | 6 | 0.028588 | 0.568733 | 0.176307 | 0.030568 | 0.036534 | 0.351011 | 0.501237 | 0.565819 | 0.043065 | 0.038329 | 0.392558 |
| 125 | Information Technology | 7 | 0.078838 | 0.620015 | 0.152706 | 0.007642 | 1.000000 | 0.369073 | 0.566959 | 0.561686 | 0.452261 | 0.146557 | 0.399659 |
| 126 | Industrials | 6 | 0.040143 | 0.625821 | 0.159905 | 0.013100 | 0.032359 | 0.352014 | 0.497040 | 0.567616 | 0.021565 | 0.048061 | 0.360742 |
| 127 | Materials | 8 | 0.001787 | 0.151148 | 0.796294 | 0.168122 | 0.005219 | 0.343050 | 0.237065 | 0.448288 | 0.170775 | 0.037853 | 0.385287 |
| 128 | Utilities | 11 | 0.021433 | 0.472782 | 0.131437 | 0.004367 | 0.002088 | 0.351995 | 0.502522 | 0.562225 | 0.064294 | 0.038516 | 0.341384 |
| 129 | Information Technology | 7 | 0.044158 | 0.357954 | 0.107914 | 0.006550 | 0.030271 | 0.356650 | 0.504040 | 0.569863 | 0.043297 | 0.046394 | 0.277958 |
| 130 | Information Technology | 7 | 0.068448 | 0.512471 | 0.044536 | 0.028384 | 0.018789 | 0.349962 | 0.505316 | 0.577231 | 0.033684 | 0.051704 | 0.332109 |
| 131 | Information Technology | 7 | 0.018552 | 0.463332 | 0.267237 | 0.015284 | 0.084551 | 0.348724 | 0.495511 | 0.565460 | 0.018271 | 0.025309 | 0.390547 |
| 132 | Industrials | 6 | 0.033626 | 0.567117 | 0.270685 | 0.014192 | 0.076200 | 0.349204 | 0.499073 | 0.575523 | 0.019093 | 0.025962 | 0.444049 |
| 133 | Industrials | 6 | 0.029580 | 0.482867 | 0.272558 | 0.016376 | 0.028184 | 0.347932 | 0.496053 | 0.567976 | 0.017205 | 0.034279 | 0.403862 |
| 134 | Materials | 8 | 0.027258 | 0.608897 | 0.374999 | 0.027293 | 0.005219 | 0.349590 | 0.500667 | 0.582802 | 0.017277 | 0.014770 | 0.395844 |
| 135 | Real Estate | 9 | 0.111457 | 0.527843 | 0.131716 | 0.013100 | 0.049061 | 0.349715 | 0.494856 | 0.577231 | 0.006765 | 0.085933 | 0.351615 |
| 136 | Information Technology | 7 | 0.048400 | 1.000000 | 0.348868 | 0.009825 | 0.198330 | 0.339446 | 0.501864 | 0.598616 | 0.011929 | 0.017596 | 0.480949 |
| 137 | Telecommunications Services | 10 | 0.000134 | 0.438714 | 0.336287 | 0.002183 | 0.517745 | 0.358501 | 0.486387 | 0.547309 | 0.105713 | 0.022060 | 0.422143 |
| 138 | Industrials | 6 | 0.104577 | 0.456697 | 0.053649 | 0.029476 | 0.022965 | 0.300419 | 0.552283 | 0.632851 | 0.047877 | 0.022751 | 0.391659 |
| 139 | Real Estate | 9 | 0.017876 | 0.502466 | 0.170834 | 0.017467 | 0.049061 | 0.350070 | 0.519128 | 0.577231 | 0.021304 | 0.084615 | 0.351615 |
| 140 | Health Care | 5 | 0.076107 | 0.487554 | 0.197796 | 0.105895 | 0.154489 | 0.438884 | 0.867959 | 0.661066 | 0.234227 | 0.009988 | 0.386378 |
| 141 | Information Technology | 7 | 0.010847 | 0.525718 | 0.219742 | 0.006550 | 0.170146 | 0.325253 | 0.518386 | 0.559080 | 0.209581 | 0.028539 | 0.388611 |
| 142 | Consumer Discretionary | 1 | 0.023228 | 0.581427 | 0.158921 | 0.025109 | 0.034447 | 0.229920 | 0.692412 | 0.604816 | 0.254054 | 0.005010 | 0.347138 |
| 143 | Consumer Discretionary | 1 | 0.064064 | 0.500709 | 0.115383 | 0.022926 | 0.005219 | 0.352868 | 0.505184 | 0.591697 | 0.020237 | 0.029586 | 0.406096 |
| 144 | Consumer Discretionary | 1 | 0.025715 | 0.494447 | 0.242355 | 0.014192 | 0.126305 | 0.339197 | 0.499984 | 0.571390 | 0.026619 | 0.024027 | 0.399053 |
| 145 | Consumer Discretionary | 1 | 0.022173 | 0.563468 | 0.205261 | 0.007642 | 0.031315 | 0.329132 | 0.496873 | 0.560158 | 0.039407 | 0.048985 | 0.389675 |
| 146 | Industrials | 6 | 0.155921 | 0.409013 | 0.159983 | 0.036026 | 0.016701 | 0.352543 | 0.506504 | 0.654956 | 0.006215 | 0.027413 | 0.430015 |
| 147 | Energy | 3 | 0.023252 | 0.411307 | 0.320493 | 0.003275 | 0.197286 | 0.594082 | 0.476485 | 0.542816 | 0.134009 | 0.171688 | 0.455518 |
| 148 | Consumer Discretionary | 1 | 0.049479 | 0.391978 | 0.221008 | 0.028384 | 0.096033 | 0.353171 | 0.499892 | 0.582352 | 0.015900 | 0.029944 | 0.430173 |
| 149 | Financials | 4 | 0.005164 | 0.501782 | 0.157174 | 0.010917 | 0.103340 | 0.338877 | 0.504919 | 0.557283 | 0.133309 | 0.020096 | 0.370663 |
| 150 | Health Care | 5 | 0.049691 | 0.338588 | 0.307196 | 0.029476 | 0.013570 | 0.356030 | 0.534855 | 0.596100 | 0.063039 | 0.019467 | 0.335504 |
| 151 | Real Estate | 9 | 0.050006 | 0.461667 | 0.158198 | 0.005459 | 0.049061 | 0.347028 | 0.508173 | 0.571031 | 0.054412 | 0.049540 | 0.352762 |
| 152 | Real Estate | 9 | 0.023871 | 0.482949 | 0.142745 | 0.005459 | 0.049061 | 0.355645 | 0.478815 | 0.539042 | 0.070863 | 0.058496 | 0.364794 |
| 153 | Energy | 3 | 0.034618 | 0.416357 | 0.432927 | 0.016376 | 0.107516 | 0.359064 | 0.426767 | 0.453051 | 0.041721 | 0.048509 | 0.401513 |
| 154 | Financials | 4 | 0.030666 | 0.412250 | 0.107664 | 0.009825 | 0.103340 | 0.352089 | 0.525537 | 0.586306 | 0.063219 | 0.014845 | 0.349891 |
| 155 | Consumer Discretionary | 1 | 0.032185 | 0.292447 | 0.215034 | 0.043668 | 0.032359 | 0.344787 | 0.506154 | 0.583251 | 0.028553 | 0.017709 | 0.400560 |
| 156 | Industrials | 6 | 0.077980 | 0.552448 | 0.096256 | 0.027293 | 0.042797 | 0.303516 | 0.589869 | 0.604816 | 0.122755 | 0.026691 | 0.352900 |
| 157 | Information Technology | 7 | 0.008422 | 0.286665 | 0.693375 | 0.006550 | 0.046973 | 0.585856 | 0.541776 | 0.566089 | 0.079313 | 0.042609 | 0.390255 |
| 158 | Information Technology | 7 | 0.005777 | 0.482392 | 0.426371 | 0.016376 | 0.018789 | 0.422495 | 0.585408 | 0.566089 | 0.079313 | 0.042609 | 0.390255 |
| 159 | Consumer Staples | 2 | 0.027581 | 0.700968 | 0.089759 | 0.017467 | 0.030271 | 0.350965 | 0.504776 | 0.579387 | 0.027980 | 0.040249 | 0.361328 |
| 160 | Health Care | 5 | 0.120973 | 0.640497 | 0.072984 | 0.017467 | 0.003132 | 0.350013 | 0.500460 | 0.601851 | 0.009004 | 0.046530 | 0.399860 |
| 161 | Real Estate | 9 | 0.008532 | 0.429746 | 0.223939 | 0.007642 | 0.049061 | 0.336638 | 0.502105 | 0.551891 | 0.409140 | 0.127197 | 0.370427 |
| 162 | Consumer Staples | 2 | 0.066724 | 0.429314 | 0.118335 | 0.054585 | 0.016701 | 0.349671 | 0.501166 | 0.579387 | 0.027980 | 0.040249 | 0.361328 |
| 163 | Health Care | 5 | 0.136967 | 0.459812 | 0.229288 | 0.012009 | 0.073069 | 0.370449 | 0.517073 | 0.626651 | 0.019855 | 0.034217 | 0.328066 |
| 164 | Information Technology | 7 | 0.104782 | 0.409444 | 0.090909 | 0.099345 | 0.025052 | 0.325848 | 0.765437 | 0.671040 | 0.155067 | 0.013852 | 0.394629 |
| 165 | Health Care | 5 | 0.053855 | 0.445911 | 0.191434 | 0.247817 | 0.041754 | 0.344503 | 0.494477 | 0.568515 | 0.010620 | 0.061496 | 0.366199 |
| 166 | Materials | 8 | 0.090629 | 0.607653 | 0.109099 | 0.027293 | 0.026096 | 0.341280 | 0.499213 | 0.596550 | 0.008661 | 0.038310 | 0.395581 |
| 167 | Information Technology | 7 | 0.023574 | 0.598590 | 0.128119 | 0.019651 | 0.169102 | 0.749249 | 0.728539 | 0.571570 | 0.768299 | 0.021632 | 0.390255 |
| 168 | Materials | 8 | 0.026132 | 0.460976 | 0.147774 | 0.025109 | 0.028184 | 0.324565 | 0.510027 | 0.570132 | 0.063477 | 0.026319 | 0.400826 |
| 169 | Consumer Discretionary | 1 | 0.014782 | 0.674788 | 0.105706 | 0.024017 | 0.020877 | 0.345624 | 0.499950 | 0.559889 | 0.062280 | 0.034350 | 0.372275 |
| 170 | Real Estate | 9 | 0.017718 | 0.333353 | 0.147709 | 0.025109 | 0.015658 | 0.350633 | 0.493042 | 0.555126 | 0.030141 | 0.083095 | 0.357511 |
| 171 | Health Care | 5 | 0.426353 | 0.644567 | 0.102121 | 0.014192 | 0.330898 | 0.354132 | 0.502748 | 0.692515 | 0.001538 | 0.059948 | 0.578636 |
| 172 | Industrials | 6 | 0.069408 | 0.586276 | 0.106503 | 0.038210 | 0.135699 | 0.322407 | 0.530060 | 0.596280 | 0.055508 | 0.028615 | 0.407954 |
| 173 | Financials | 4 | 0.022811 | 0.530402 | 0.220354 | 0.012009 | 0.069937 | 0.363443 | 0.510655 | 0.570222 | 0.065348 | 0.022622 | 0.391540 |
| 174 | Industrials | 6 | 0.054201 | 0.490217 | 0.126131 | 0.034934 | 0.001044 | 0.350544 | 0.499380 | 0.583071 | 0.014370 | 0.032270 | 0.384743 |
| 175 | Industrials | 6 | 0.029478 | 0.574167 | 0.259906 | 0.006550 | 0.024008 | 0.342056 | 0.496789 | 0.571660 | 0.015905 | 0.027422 | 0.401660 |
| 176 | Information Technology | 7 | 0.018183 | 0.533179 | 0.288183 | 0.014192 | 0.112735 | 0.343716 | 0.503684 | 0.564471 | 0.059283 | 0.026855 | 0.384394 |
| 177 | Financials | 4 | 0.048432 | 0.539854 | 0.103246 | 0.009825 | 0.103340 | 0.120950 | 1.000000 | 0.604277 | 0.654367 | 0.015194 | 0.361784 |
| 178 | Real Estate | 9 | 0.017285 | 0.546476 | 0.127772 | 0.018559 | 0.049061 | 0.350626 | 0.509112 | 0.567976 | 0.068034 | 0.019479 | 0.364794 |
| 179 | Consumer Staples | 2 | 0.096659 | 0.632610 | 0.035676 | 0.634279 | 0.010438 | 0.345240 | 0.511591 | 0.574894 | 0.054915 | 0.081614 | 0.358635 |
| 180 | Energy | 3 | 0.008202 | 0.000000 | 0.625491 | 0.000000 | 0.007307 | 0.347867 | 0.495747 | 0.550813 | 0.408102 | 0.278544 | 0.361749 |
| 181 | Consumer Staples | 2 | 0.030273 | 0.527907 | 0.040747 | 0.030568 | 0.077244 | 0.298980 | 0.643715 | 0.565100 | 0.704876 | 0.042819 | 0.363834 |
| 182 | Industrials | 6 | 0.055232 | 0.280792 | 0.347938 | 0.012009 | 0.018789 | 0.343945 | 0.500552 | 0.589541 | 0.013368 | 0.026655 | 0.361174 |
| 183 | Industrials | 6 | 0.029533 | 0.480471 | 0.122404 | 0.031659 | 0.037578 | 0.348067 | 0.497336 | 0.570671 | 0.018729 | 0.029051 | 0.385265 |
| 184 | Consumer Discretionary | 1 | 0.034956 | 0.477923 | 0.217320 | 0.014192 | 0.026096 | 0.346698 | 0.507211 | 0.584689 | 0.029322 | 0.018478 | 0.369480 |
| 185 | Health Care | 5 | 0.093778 | 0.599959 | 0.226149 | 0.008734 | 0.043841 | 0.354823 | 0.499581 | 0.595112 | 0.024557 | 0.024691 | 0.364670 |
| 186 | Consumer Discretionary | 1 | 0.019780 | 0.504701 | 0.180425 | 0.014192 | 0.012526 | 0.349706 | 0.499296 | 0.577141 | 0.030323 | 0.034059 | 0.366802 |
| 187 | Industrials | 6 | 0.090527 | 0.603522 | 0.202832 | 0.005459 | 0.007307 | 0.343207 | 0.485470 | 0.523228 | 0.008666 | 0.027422 | 0.440580 |
| 188 | Health Care | 5 | 0.062781 | 0.468962 | 0.183905 | 0.017467 | 0.056367 | 0.344139 | 0.540680 | 0.570312 | 0.168519 | 0.065099 | 0.367807 |
| 189 | Industrials | 6 | 0.167382 | 0.512656 | 0.044175 | 0.125546 | 0.008351 | 0.339422 | 0.565624 | 0.654327 | 0.046084 | 0.029998 | 0.318087 |
| 190 | Utilities | 11 | 0.021036 | 0.526245 | 0.099478 | 0.009825 | 0.000000 | 0.348958 | 0.498570 | 0.580106 | 0.014339 | 0.012108 | 0.348529 |
| 191 | Financials | 4 | 0.010146 | 0.321359 | 0.213439 | 0.001092 | 0.084551 | 0.330598 | 0.495729 | 0.556564 | 0.051050 | 0.039163 | 0.467583 |
| 192 | Industrials | 6 | 0.030351 | 0.596829 | 0.208774 | 0.031659 | 0.042797 | 0.359971 | 0.535939 | 0.579567 | 0.103274 | 0.019259 | 0.346041 |
| 193 | Telecommunications Services | 10 | 0.039246 | 0.703036 | 0.188166 | 0.036026 | 0.063674 | 0.359127 | 0.562039 | 0.637164 | 0.053148 | 0.005071 | 0.360035 |
| 194 | Materials | 8 | 0.064859 | 0.486418 | 0.227868 | 0.073144 | 0.048017 | 0.347210 | 0.583782 | 0.636355 | 0.071369 | 0.011613 | 0.421259 |
| 195 | Information Technology | 7 | 0.073092 | 0.534605 | 0.094288 | 0.067686 | 0.135699 | 0.369636 | 0.569856 | 0.580106 | 0.180321 | 0.049592 | 0.387803 |
| 196 | Real Estate | 9 | 0.067937 | 0.565193 | 0.115578 | 0.012009 | 0.049061 | 0.350898 | 0.497785 | 0.589541 | 0.008458 | 0.033625 | 0.355923 |
| 197 | Real Estate | 9 | 0.059971 | 0.502176 | 0.113382 | 0.009825 | 0.049061 | 0.350607 | 0.500637 | 0.577590 | 0.021304 | 0.044301 | 0.351615 |
| 198 | Consumer Discretionary | 1 | 0.049227 | 0.441917 | 0.236370 | 0.025109 | 0.003132 | 0.350307 | 0.508380 | 0.578848 | 0.038994 | 0.034059 | 0.310692 |
| 199 | Industrials | 6 | 0.018734 | 0.575122 | 0.180717 | 0.286026 | 0.063674 | 0.353215 | 0.497874 | 0.559170 | 0.051697 | 0.046734 | 0.381798 |
| 200 | Consumer Discretionary | 1 | 0.017844 | 0.760948 | 0.308963 | 0.014192 | 0.056367 | 0.348091 | 0.498174 | 0.559619 | 0.051272 | 0.042319 | 0.393084 |
| 201 | Consumer Discretionary | 1 | 0.089449 | 0.656370 | 0.000000 | 0.068777 | 0.271399 | 0.525948 | 0.584893 | 0.593225 | 0.148740 | 0.041087 | 0.405691 |
| 202 | Financials | 4 | 0.075438 | 0.484195 | 0.139239 | 0.181223 | 0.191023 | 0.367381 | 0.510096 | 0.592147 | 0.028150 | 0.035066 | 0.412970 |
| 203 | Consumer Staples | 2 | 0.031753 | 0.520737 | 0.152951 | 0.027293 | 0.017745 | 0.358032 | 0.641964 | 0.590260 | 0.259444 | 0.013428 | 0.308551 |
| 204 | Financials | 4 | 0.034405 | 0.474611 | 0.105407 | 0.007642 | 0.103340 | 0.411360 | 0.601167 | 0.591338 | 0.183340 | 0.014325 | 0.361799 |
| 205 | Consumer Discretionary | 1 | 0.145531 | 0.495630 | 0.197385 | 0.013100 | 0.003132 | 0.350051 | 0.503300 | 0.573187 | 0.034232 | 0.133665 | 0.351581 |
| 206 | Consumer Staples | 2 | 0.058601 | 0.579469 | 0.256114 | 0.111354 | 0.141962 | 0.363183 | 0.504096 | 0.579387 | 0.027980 | 0.040249 | 0.402638 |
| 207 | Consumer Staples | 2 | 0.063804 | 0.529514 | 0.077740 | 0.025109 | 0.009395 | 0.351661 | 0.498845 | 0.578129 | 0.016346 | 0.046301 | 0.361328 |
| 208 | Materials | 8 | 0.103963 | 0.354895 | 0.371986 | 0.006550 | 0.048017 | 0.352426 | 0.496493 | 0.588642 | 0.006415 | 0.054758 | 0.385849 |
| 209 | Financials | 4 | 0.040104 | 0.520179 | 0.078245 | 0.026201 | 0.103340 | 0.332291 | 0.523807 | 0.576961 | 0.082125 | 0.029492 | 0.361476 |
| 210 | Industrials | 6 | 0.115030 | 0.519249 | 0.064867 | 0.044760 | 0.028184 | 0.347460 | 0.591224 | 0.619283 | 0.097587 | 0.031570 | 0.380844 |
| 211 | Consumer Staples | 2 | 0.035541 | 0.566934 | 0.221681 | 0.010917 | 0.592902 | 0.407015 | 0.501871 | 0.583251 | 0.235153 | 0.042819 | 0.345682 |
| 212 | Consumer Staples | 2 | 0.042276 | 0.528624 | 0.058709 | 0.197598 | 0.034447 | 0.320781 | 0.599729 | 0.573906 | 0.315618 | 0.035928 | 0.338653 |
| 213 | Materials | 8 | 0.018175 | 0.351342 | 0.545250 | 0.009825 | 0.064718 | 0.316205 | 0.511328 | 0.574984 | 0.053965 | 0.013242 | 0.399475 |
| 214 | Energy | 3 | 0.037262 | 0.573857 | 0.326553 | 0.022926 | 0.018789 | 0.339078 | 0.549927 | 0.597448 | 0.083413 | 0.013072 | 0.399239 |
| 215 | Health Care | 5 | 0.038034 | 0.530049 | 0.141750 | 0.009825 | 0.073069 | 0.384430 | 0.583073 | 0.564112 | 0.453994 | 0.058074 | 0.350107 |
| 216 | Energy | 3 | 0.006368 | 0.262902 | 0.673851 | 0.012009 | 0.074113 | 0.313743 | 0.444528 | 0.520622 | 0.105747 | 0.171688 | 0.377187 |
| 217 | Financials | 4 | 0.091841 | 0.457695 | 0.168247 | 0.006550 | 0.103340 | 0.350390 | 0.512980 | 0.614790 | 0.019875 | 0.026373 | 0.366409 |
| 218 | Health Care | 5 | 0.263395 | 0.646621 | 0.099357 | 0.065502 | 0.018789 | 0.350983 | 0.497828 | 0.664480 | 0.000000 | 0.045063 | 0.392444 |
| 219 | Energy | 3 | 0.014129 | 0.377158 | 0.550581 | 0.045852 | 0.028184 | 0.322090 | 0.443135 | 0.432833 | 0.023910 | 0.048509 | 0.364654 |
| 220 | Health Care | 5 | 0.039018 | 0.785926 | 0.407120 | 0.008734 | 0.031315 | 0.382163 | 0.508143 | 0.564112 | 0.261814 | 0.058074 | 0.350446 |
| 221 | Financials | 4 | 0.005471 | 0.479520 | 0.389319 | 0.026201 | 0.103340 | 0.355280 | 0.511257 | 0.573816 | 0.056615 | 0.002607 | 0.361813 |
| 222 | Energy | 3 | 0.022378 | 0.532665 | 0.461744 | 0.025109 | 0.059499 | 0.345709 | 0.439587 | 0.495372 | 0.061072 | 0.171688 | 0.376688 |
| 223 | Financials | 4 | 0.042245 | 0.547458 | 0.215784 | 0.007642 | 0.122129 | 0.346616 | 0.499395 | 0.572918 | 0.022753 | 0.037682 | 0.313873 |
| 224 | Utilities | 11 | 0.078232 | 0.522283 | 0.075441 | 0.012009 | 0.006263 | 0.350369 | 0.547842 | 0.604816 | 0.068944 | 0.026791 | 0.335142 |
| 225 | Materials | 8 | 0.010618 | 0.567362 | 0.468662 | 0.001092 | 0.206681 | 0.362411 | 0.495059 | 0.553778 | 0.078928 | 0.074084 | 0.404959 |
| 226 | Information Technology | 7 | 0.086489 | 0.570313 | 0.486833 | 0.005459 | 0.067850 | 0.372317 | 0.493030 | 0.552520 | 0.064457 | 0.745526 | 0.343199 |
| 227 | Energy | 3 | 0.022087 | 0.428972 | 0.438892 | 0.265284 | 0.001044 | 0.350275 | 0.420388 | 0.359601 | 0.021375 | 0.171688 | 0.370305 |
| 228 | Industrials | 6 | 0.033138 | 0.509496 | 0.120963 | 0.013100 | 0.021921 | 0.353184 | 0.502356 | 0.563842 | 0.055462 | 0.051664 | 0.310666 |
| 229 | Energy | 3 | 0.022819 | 0.338329 | 0.316843 | 0.004367 | 0.051148 | 0.305017 | 0.474442 | 0.532033 | 0.058510 | 0.171688 | 0.417682 |
| 230 | Industrials | 6 | 0.063041 | 0.554501 | 0.373199 | 0.013100 | 0.051148 | 0.354560 | 0.522910 | 0.596010 | 0.044954 | 0.025812 | 0.375495 |
| 231 | Financials | 4 | 0.053202 | 0.517967 | 0.142558 | 0.010917 | 0.069937 | 0.456712 | 0.510773 | 0.586126 | 0.034895 | 0.028476 | 0.305681 |
| 232 | Materials | 8 | 0.028179 | 0.525685 | 0.189103 | 0.004367 | 0.153445 | 0.379186 | 0.497929 | 0.559889 | 0.048037 | 0.063551 | 0.425410 |
| 233 | Consumer Discretionary | 1 | 0.031154 | 0.558905 | 0.236071 | 0.019651 | 0.014614 | 0.352915 | 0.497769 | 0.561596 | 0.039396 | 0.058983 | 0.360865 |
| 234 | Real Estate | 9 | 0.037097 | 0.543646 | 0.096551 | 0.003275 | 0.049061 | 0.351697 | 0.496389 | 0.559709 | 0.037945 | 0.084615 | 0.351615 |
| 235 | Energy | 3 | 0.015868 | 0.225155 | 0.734885 | 0.078603 | 0.006263 | 0.348206 | 0.495580 | 0.560428 | 0.029635 | 0.034548 | 0.331868 |
| 236 | Consumer Discretionary | 1 | 0.056012 | 0.606177 | 0.086617 | 0.048035 | 0.018789 | 0.357347 | 0.513277 | 0.589721 | 0.035759 | 0.026935 | 0.319982 |
| 237 | Consumer Discretionary | 1 | 0.195931 | 0.470671 | 0.092596 | 0.050218 | 0.004175 | 0.346357 | 0.509886 | 0.633660 | 0.011782 | 0.046192 | 0.369206 |
| 238 | Energy | 3 | 0.049675 | 0.469704 | 0.222611 | 0.033843 | 0.066806 | 0.332166 | 0.327267 | 0.457993 | 0.120299 | 0.171688 | 0.387283 |
| 239 | Financials | 4 | 0.009170 | 0.491858 | 0.103889 | 0.004367 | 0.103340 | 0.341224 | 0.495895 | 0.557642 | 0.044812 | 0.030172 | 0.368900 |
| 240 | Industrials | 6 | 0.012712 | 0.498630 | 0.136851 | 0.247817 | 0.035491 | 0.337934 | 0.498977 | 0.568245 | 0.028100 | 0.013687 | 0.367407 |
| 241 | Industrials | 6 | 0.033768 | 0.370055 | 0.183630 | 0.024017 | 0.114823 | 0.359277 | 0.523911 | 0.590529 | 0.053362 | 0.014381 | 0.401638 |
| 242 | Utilities | 11 | 0.038325 | 0.466229 | 0.079711 | 0.004367 | 0.006263 | 0.349681 | 0.508985 | 0.566178 | 0.075500 | 0.050373 | 0.365517 |
| 243 | Consumer Discretionary | 1 | 1.000000 | 0.492460 | 0.139120 | 0.030568 | 0.192067 | 0.298280 | 0.543660 | 1.000000 | 0.003794 | 0.042882 | 0.365851 |
| 244 | Utilities | 11 | 0.026912 | 0.380687 | 0.116328 | 0.013100 | 0.011482 | 0.350307 | 0.525474 | 0.579747 | 0.077965 | 0.016603 | 0.369217 |
| 245 | Consumer Staples | 2 | 0.075107 | 0.520661 | 0.018767 | 0.048035 | 0.070981 | 0.443200 | 0.604128 | 0.583251 | 0.235153 | 0.045700 | 0.345682 |
| 246 | Health Care | 5 | 0.021866 | 0.491877 | 0.131427 | 0.010917 | 0.082463 | 0.359877 | 0.635564 | 0.560068 | 1.000000 | 0.048811 | 0.350446 |
| 247 | Financials | 4 | 0.031863 | 0.409316 | 0.206875 | 0.013100 | 0.122129 | 0.372479 | 0.516198 | 0.586845 | 0.044453 | 0.015251 | 0.348917 |
| 248 | Consumer Staples | 2 | 0.058963 | 0.565565 | 0.018949 | 0.017467 | 0.134656 | 0.355573 | 0.503733 | 0.579387 | 0.075628 | 0.040249 | 0.359982 |
| 249 | Financials | 4 | 0.021488 | 0.495641 | 0.091954 | 0.017467 | 0.103340 | 0.354185 | 0.516898 | 0.569323 | 0.091196 | 0.022447 | 0.366870 |
| 250 | Consumer Discretionary | 1 | 0.010484 | 0.406780 | 0.249966 | 0.009825 | 0.026096 | 0.333861 | 0.500773 | 0.562315 | 0.053879 | 0.019001 | 0.369480 |
| 251 | Consumer Staples | 2 | 0.065654 | 0.562319 | 0.033349 | 0.055677 | 0.022965 | 0.404823 | 0.633750 | 0.589631 | 0.249087 | 0.032286 | 0.364069 |
| 252 | Financials | 4 | 0.071479 | 0.529680 | 0.100697 | 0.008734 | 0.103340 | 0.341330 | 0.576068 | 0.617486 | 0.084535 | 0.018546 | 0.367597 |
| 253 | Industrials | 6 | 0.035444 | 0.431539 | 0.297058 | 0.001092 | 0.008351 | 0.351054 | 0.488881 | 0.546141 | 0.025154 | 0.022174 | 0.338936 |
| 254 | Utilities | 11 | 0.047212 | 0.466117 | 0.106647 | 0.009825 | 0.003132 | 0.351554 | 0.499588 | 0.585318 | 0.013586 | 0.025576 | 0.341303 |
| 255 | Materials | 8 | 0.074241 | 0.648753 | 0.207919 | 0.029476 | 0.032359 | 0.370105 | 0.519783 | 0.596460 | 0.039754 | 0.030740 | 0.368089 |
| 256 | Utilities | 11 | 0.023322 | 0.494747 | 0.097714 | 0.006550 | 0.022965 | 0.332948 | 0.504690 | 0.558990 | 0.105612 | 0.058763 | 0.357202 |
| 257 | Financials | 4 | 0.060538 | 0.525685 | 0.128495 | 0.013100 | 0.122129 | 0.434818 | 0.608088 | 0.661066 | 0.069872 | 0.006943 | 0.350613 |
| 258 | Energy | 3 | 0.060845 | 0.513805 | 0.168039 | 0.018559 | 0.042797 | 0.283842 | 0.578591 | 0.619822 | 0.084096 | 0.014433 | 0.405238 |
| 259 | Industrials | 6 | 0.012397 | 0.298463 | 0.577385 | 0.009825 | 0.011482 | 0.348626 | 0.497182 | 0.564202 | 0.028497 | 0.018664 | 0.391898 |
| 260 | Materials | 8 | 0.077059 | 0.464111 | 0.103480 | 0.037118 | 0.008351 | 0.351214 | 0.522723 | 0.598347 | 0.042296 | 0.030590 | 0.373782 |
| 261 | Information Technology | 7 | 0.024952 | 0.632071 | 0.310015 | 0.008734 | 0.026096 | 0.325285 | 0.516073 | 0.558900 | 0.195760 | 0.063349 | 0.397464 |
| 262 | Industrials | 6 | 0.041190 | 0.233756 | 0.315269 | 0.015284 | 0.004175 | 0.350896 | 0.496827 | 0.601581 | 0.004131 | 0.013232 | 0.312394 |
| 263 | Consumer Discretionary | 1 | 0.076123 | 0.592629 | 0.214030 | 0.007642 | 0.009395 | 0.348440 | 0.504352 | 0.577141 | 0.031323 | 0.058021 | 0.294329 |
| 264 | Health Care | 5 | 0.423763 | 0.627561 | 0.277935 | 0.017467 | 0.134656 | 0.355573 | 0.503733 | 0.605355 | 0.012300 | 0.161968 | 0.470447 |
| 265 | Industrials | 6 | 0.033563 | 0.386279 | 0.106374 | 0.038210 | 0.035491 | 0.348601 | 0.497932 | 0.574355 | 0.016940 | 0.027414 | 0.390911 |
| 266 | Industrials | 6 | 0.145846 | 0.661201 | 0.084651 | 0.013100 | 0.112735 | 0.355814 | 0.504984 | 0.612095 | 0.011892 | 0.046640 | 0.291951 |
| 267 | Energy | 3 | 0.015829 | 0.215530 | 0.774610 | 0.027293 | 0.000000 | 0.350557 | 0.475595 | 0.511367 | 0.022618 | 0.171688 | 0.373539 |
| 268 | Industrials | 6 | 0.031083 | 0.527255 | 0.027726 | 0.009825 | 0.002088 | 0.349218 | 0.506106 | 0.569144 | 0.052637 | 0.033556 | 0.359146 |
| 269 | Utilities | 11 | 0.044071 | 0.532019 | 0.138574 | 0.014192 | 0.009395 | 0.351777 | 0.506025 | 0.596819 | 0.018794 | 0.016478 | 0.351405 |
| 270 | Financials | 4 | 0.022378 | 0.612563 | 0.188147 | 0.010917 | 0.103340 | 0.369792 | 0.520638 | 0.559260 | 0.222400 | 0.054709 | 0.370346 |
| 271 | Energy | 3 | 0.015302 | 0.364365 | 0.337318 | 0.002183 | 0.006263 | 0.350494 | 0.494559 | 0.552520 | 0.105713 | 0.151620 | 0.358404 |
| 272 | Materials | 8 | 0.031564 | 0.410867 | 0.220167 | 0.045852 | 0.019833 | 0.352840 | 0.495172 | 0.564561 | 0.018039 | 0.046518 | 0.357739 |
| 273 | Materials | 8 | 0.200795 | 0.623085 | 0.180230 | 0.131004 | 0.010438 | 0.355718 | 0.512442 | 0.652170 | 0.010590 | 0.037853 | 0.384750 |
| 274 | Real Estate | 9 | 0.085387 | 0.500425 | 0.093271 | 0.003275 | 0.049061 | 0.349743 | 0.496395 | 0.559080 | 0.040910 | 0.205349 | 0.355923 |
| 275 | Consumer Discretionary | 1 | 0.039915 | 0.581006 | 0.270532 | 0.042576 | 0.024008 | 0.330079 | 0.503123 | 0.591967 | 0.016634 | 0.016876 | 0.332178 |
| 276 | Utilities | 11 | 0.033287 | 0.504023 | 0.042085 | 0.010917 | 0.015658 | 0.372263 | 0.540942 | 0.573277 | 0.147348 | 0.028681 | 0.357356 |
| 277 | Real Estate | 9 | 0.149506 | 0.512956 | 0.104600 | 0.051310 | 0.049061 | 0.353336 | 0.535071 | 0.602750 | 0.054825 | 0.057384 | 0.364794 |
| 278 | Financials | 4 | 0.074053 | 0.598682 | 0.090384 | 0.649563 | 0.053236 | 0.318779 | 0.514572 | 0.588193 | 0.039743 | 0.038479 | 0.350613 |
| 279 | Industrials | 6 | 0.091385 | 0.325141 | 0.122347 | 0.009825 | 0.009395 | 0.351601 | 0.496040 | 0.577051 | 0.009908 | 0.070459 | 0.279042 |
| 280 | Utilities | 11 | 0.070455 | 0.433913 | 0.102235 | 0.010917 | 0.009395 | 0.345333 | 0.518616 | 0.598706 | 0.036034 | 0.027381 | 0.329324 |
| 281 | Financials | 4 | 0.030178 | 0.578387 | 0.183207 | 0.007642 | 0.103340 | 0.268297 | 0.530769 | 0.582442 | 0.082573 | 0.016947 | 0.358887 |
| 282 | Financials | 4 | 0.048691 | 0.452756 | 0.184950 | 0.008734 | 0.122129 | 0.330289 | 0.531749 | 0.590619 | 0.066771 | 0.022307 | 0.351266 |
| 283 | Information Technology | 7 | 0.056933 | 0.377914 | 0.333837 | 0.026201 | 0.234864 | 0.357994 | 0.507115 | 0.587744 | 0.026412 | 0.029164 | 0.407113 |
| 284 | Energy | 3 | 0.002054 | 0.022818 | 1.000000 | 0.217249 | 0.002088 | 0.349368 | 0.395497 | 0.495372 | 0.061072 | 0.171688 | 0.377187 |
| 285 | Financials | 4 | 0.020394 | 0.433105 | 0.286431 | 0.018559 | 0.103340 | 0.366102 | 0.536627 | 0.573816 | 0.131231 | 0.016181 | 0.366876 |
| 286 | Health Care | 5 | 0.069613 | 0.445080 | 0.105280 | 0.017467 | 0.121086 | 0.400100 | 0.520471 | 0.584239 | 0.056923 | 0.040743 | 0.405227 |
| 287 | Telecommunications Services | 10 | 0.023543 | 0.519388 | 0.032826 | 0.010917 | 0.011482 | 0.241649 | 0.768668 | 0.571210 | 0.913809 | 0.022060 | 0.256267 |
| 288 | Consumer Staples | 2 | 0.070385 | 0.589726 | 0.125983 | 0.004367 | 0.036534 | 0.345856 | 0.497967 | 0.567347 | 0.025709 | 0.086606 | 0.247261 |
| 289 | Information Technology | 7 | 0.017254 | 0.374785 | 0.519251 | 0.026201 | 0.091858 | 0.350713 | 0.486012 | 0.536167 | 0.018298 | 0.136392 | 0.390807 |
| 290 | Consumer Discretionary | 1 | 0.016545 | 0.594569 | 0.276615 | 0.021834 | 0.021921 | 0.350892 | 0.500053 | 0.568245 | 0.032224 | 0.018233 | 0.308955 |
| 291 | Financials | 4 | 0.041450 | 0.472668 | 0.075335 | 0.013100 | 0.103340 | 0.350412 | 0.501461 | 0.587744 | 0.015906 | 0.020266 | 0.361799 |
| 292 | Health Care | 5 | 0.108111 | 0.613976 | 0.133768 | 0.008734 | 0.011482 | 0.322676 | 0.531653 | 0.594483 | 0.060440 | 0.048873 | 0.234359 |
| 293 | Consumer Discretionary | 1 | 0.063560 | 0.801845 | 0.219706 | 0.014192 | 0.221294 | 0.355530 | 0.494601 | 0.562315 | 0.018887 | 0.112054 | 0.383786 |
| 294 | Financials | 4 | 0.085293 | 0.588750 | 0.058801 | 0.015284 | 0.103340 | 0.350744 | 0.562164 | 0.648666 | 0.046521 | 0.013967 | 0.366635 |
| 295 | Consumer Discretionary | 1 | 0.063757 | 0.473991 | 0.181432 | 0.030568 | 0.010438 | 0.350953 | 0.499028 | 0.577141 | 0.017576 | 0.048147 | 0.400276 |
| 296 | Consumer Staples | 2 | 0.038435 | 0.688766 | 0.221883 | 0.013100 | 0.019833 | 0.358376 | 0.515906 | 0.579387 | 0.027980 | 0.040249 | 0.361328 |
| 297 | Energy | 3 | 0.079397 | 0.545244 | 0.291397 | 0.031659 | 0.038622 | 0.348743 | 0.522577 | 0.662234 | 0.015580 | 0.010463 | 0.393371 |
| 298 | Information Technology | 7 | 0.035657 | 0.551609 | 0.219941 | 0.020742 | 0.120042 | 0.353689 | 0.498166 | 0.567706 | 0.025884 | 0.042308 | 0.359540 |
| 299 | Information Technology | 7 | 0.039600 | 0.558826 | 0.137856 | 0.031659 | 0.131524 | 0.344333 | 0.552720 | 0.575613 | 0.165761 | 0.030906 | 0.384471 |
| 300 | Consumer Discretionary | 1 | 0.059908 | 0.295371 | 0.266622 | 0.014192 | 0.028184 | 0.336065 | 0.495322 | 0.577141 | 0.030323 | 0.034059 | 0.366802 |
| 301 | Industrials | 6 | 0.041560 | 0.541635 | 0.263705 | 0.088428 | 0.043841 | 0.381959 | 0.643486 | 0.725312 | 0.056812 | 0.000000 | 0.257934 |
| 302 | Real Estate | 9 | 0.026030 | 0.545241 | 0.110412 | 0.012009 | 0.049061 | 0.350291 | 0.497569 | 0.561596 | 0.038189 | 0.049447 | 0.355923 |
| 303 | Health Care | 5 | 0.090511 | 0.410967 | 0.341974 | 0.016376 | 0.006263 | 0.351469 | 0.504660 | 0.611825 | 0.011595 | 0.027437 | 0.401469 |
| 304 | Health Care | 5 | 0.089055 | 0.475585 | 0.194752 | 0.017467 | 0.073069 | 0.457776 | 0.611653 | 0.604726 | 0.150903 | 0.031136 | 0.328066 |
| 305 | Financials | 4 | 0.022661 | 0.498465 | 0.096100 | 0.009825 | 0.122129 | 0.350882 | 0.508549 | 0.581454 | 0.035776 | 0.012472 | 0.350613 |
| 306 | Industrials | 6 | 0.058011 | 0.340165 | 0.181221 | 0.024017 | 0.044885 | 0.344458 | 0.589952 | 0.599425 | 0.136732 | 0.021438 | 0.376180 |
| 307 | Industrials | 6 | 0.072203 | 0.433853 | 0.024239 | 0.212882 | 0.045929 | 0.364288 | 0.591453 | 0.598257 | 0.142328 | 0.028473 | 0.376193 |
| 308 | Industrials | 6 | 0.072077 | 0.540166 | 0.056210 | 0.029476 | 0.032359 | 0.401539 | 0.649072 | 0.628268 | 0.137778 | 0.015391 | 0.306657 |
| 309 | Health Care | 5 | 0.060057 | 0.551744 | 0.078422 | 0.025109 | 0.063674 | 0.350438 | 0.499051 | 0.587025 | 0.011737 | 0.031667 | 0.417141 |
| 310 | Energy | 3 | 0.052115 | 0.630949 | 0.232337 | 0.019651 | 0.058455 | 0.363850 | 0.573650 | 0.621799 | 0.076828 | 0.011242 | 0.445363 |
| 311 | Materials | 8 | 0.071211 | 0.520237 | 0.289208 | 0.004367 | 0.084551 | 0.355023 | 0.495084 | 0.564831 | 0.017217 | 0.103361 | 0.369683 |
| 312 | Real Estate | 9 | 0.075139 | 0.559370 | 0.074492 | 0.010917 | 0.049061 | 0.370488 | 0.506325 | 0.582352 | 0.029841 | 0.047142 | 0.365707 |
| 313 | Industrials | 6 | 0.056972 | 0.447059 | 0.187387 | 0.039301 | 0.010438 | 0.353653 | 0.501054 | 0.577500 | 0.022451 | 0.042100 | 0.304613 |
| 314 | Information Technology | 7 | 0.065221 | 0.690823 | 0.168011 | 0.037118 | 0.132568 | 0.351716 | 0.498296 | 0.579477 | 0.014088 | 0.044977 | 0.390848 |
| 315 | Health Care | 5 | 0.095502 | 0.675837 | 0.447992 | 0.063319 | 0.230689 | 0.353356 | 0.478876 | 0.529158 | 0.034765 | 0.069829 | 0.383455 |
| 316 | Real Estate | 9 | 0.040875 | 0.463321 | 0.185023 | 0.003275 | 0.049061 | 0.350500 | 0.499212 | 0.561236 | 0.049749 | 0.079699 | 0.351287 |
| 317 | Telecommunications Services | 10 | 0.032839 | 0.522669 | 0.028446 | 0.117904 | 0.014614 | 0.158889 | 0.863185 | 0.589271 | 0.661210 | 0.014506 | 0.242409 |
| 318 | Health Care | 5 | 0.102389 | 0.597516 | 0.080962 | 0.024017 | 0.443633 | 0.352605 | 0.500251 | 0.601132 | 0.008908 | 0.039373 | 0.514984 |
| 319 | Utilities | 11 | 0.036845 | 0.441794 | 0.096148 | 0.006550 | 0.002088 | 0.350178 | 0.503821 | 0.571120 | 0.039735 | 0.035814 | 0.361959 |
| 320 | Financials | 4 | 0.039246 | 0.515384 | 0.061507 | 0.012009 | 0.103340 | 0.336169 | 0.967730 | 0.587474 | 0.888731 | 0.019176 | 0.366409 |
| 321 | Consumer Discretionary | 1 | 0.112063 | 0.458975 | 0.432725 | 0.017467 | 0.010438 | 0.342612 | 0.506796 | 0.639321 | 0.008321 | 0.022520 | 0.151244 |
| 322 | Industrials | 6 | 0.038467 | 0.530340 | 0.053863 | 0.014192 | 0.002088 | 0.310897 | 0.506171 | 0.564831 | 0.069467 | 0.055637 | 0.364082 |
| 323 | Energy | 3 | 0.016687 | 0.157969 | 0.776317 | 0.008734 | 0.004175 | 0.346178 | 0.478570 | 0.543086 | 0.118018 | 0.171688 | 0.300014 |
| 324 | Information Technology | 7 | 0.010555 | 0.435692 | 0.140344 | 0.064410 | 0.016701 | 0.335941 | 0.507938 | 0.564561 | 0.079313 | 0.015335 | 0.331777 |
| 325 | Real Estate | 9 | 0.020056 | 0.544857 | 0.157245 | 0.009825 | 0.121086 | 0.332791 | 0.501021 | 0.557912 | 0.088210 | 0.058560 | 0.382116 |
| 326 | Consumer Discretionary | 1 | 0.053642 | 0.471032 | 0.155647 | 0.068777 | 0.009395 | 0.350181 | 0.503231 | 0.596460 | 0.014755 | 0.021119 | 0.321025 |
| 327 | Consumer Discretionary | 1 | 0.050919 | 0.749904 | 0.795871 | 0.188865 | 0.206681 | 0.347364 | 0.494544 | 0.567257 | 0.011989 | 0.062682 | 0.432855 |
| 328 | Energy | 3 | 0.066811 | 0.320277 | 0.432761 | 0.092795 | 0.198330 | 0.362239 | 0.440255 | 0.317010 | 0.010644 | 0.171688 | 0.406003 |
| 329 | Utilities | 11 | 0.024724 | 0.474774 | 0.073277 | 0.008734 | 0.002088 | 0.350724 | 0.510996 | 0.567347 | 0.078249 | 0.029661 | 0.359956 |
| 330 | Financials | 4 | 0.027297 | 0.536558 | 0.067028 | 0.009825 | 0.103340 | 0.373528 | 0.515521 | 0.587834 | 0.041923 | 0.012091 | 0.333147 |
| 331 | Energy | 3 | 0.057814 | 0.497024 | 0.165562 | 0.008734 | 0.007307 | 0.322063 | 0.827142 | 0.584509 | 0.679614 | 0.032967 | 0.357790 |
| 332 | Health Care | 5 | 0.044354 | 0.656002 | 0.071244 | 0.010917 | 0.062630 | 0.354717 | 0.495710 | 0.565999 | 0.018374 | 0.059148 | 0.375148 |
| 333 | Information Technology | 7 | 0.004825 | 0.553961 | 0.294659 | 0.004367 | 0.027140 | 0.349212 | 0.500354 | 0.553689 | 0.179545 | 0.042609 | 0.369538 |
| 334 | Industrials | 6 | 0.025188 | 0.568983 | 0.112597 | 0.016376 | 0.086639 | 0.351088 | 0.497561 | 0.566807 | 0.024982 | 0.031383 | 0.391111 |
| 335 | Information Technology | 7 | 0.022638 | 0.606935 | 0.289062 | 0.015284 | 0.479123 | 0.318273 | 0.399602 | 0.508222 | 0.148702 | 0.049592 | 0.401498 |
| 336 | Consumer Discretionary | 1 | 0.037795 | 0.376104 | 0.193849 | 0.153930 | 0.028184 | 0.355530 | 0.517428 | 0.576602 | 0.066488 | 0.028084 | 0.352274 |
| 337 | Health Care | 5 | 0.077209 | 0.552717 | 0.174438 | 0.000000 | 0.104384 | 0.362317 | 0.493538 | 0.556923 | 0.026223 | 0.244885 | 0.254575 |
| 338 | Financials | 4 | 0.017946 | 0.449897 | 0.191067 | 0.003275 | 0.103340 | 0.349192 | 0.496925 | 0.560697 | 0.037546 | 0.037735 | 0.370673 |
| 339 | Health Care | 5 | 0.034177 | 0.624463 | 0.228009 | 0.033843 | 0.067850 | 0.359064 | 0.497540 | 0.556025 | 0.076792 | 0.128613 | 0.379378 |
# Plot new results for each sector, using only the top variables and our dictionary from earlier
#normalized_numerical_data_df is the scaled dataframe we created earlier as part of final_normalized_df_eda
cols_to_include = ['GICS Sector Int', 'Net Income', 'Estimated Shares Outstanding',
'Earnings Per Share', 'Current Price', 'P/E Ratio',
'P/B Ratio', 'Volatility', 'ROE']
# Create an empty DataFrame to store correlations
all_correlations = []
# Plot new results by sector
for friendly_name, rank in rank_mapping.items():
# Filter the DataFrame for the current sector
df_temp_corr = final_normalized_df_eda[cols_to_include]
df_temp_corr = df_temp_corr[df_temp_corr['GICS Sector Int'] == rank]
# Calculate the correlation matrix
corr_matrix = df_temp_corr.corr()
# Store the correlation matrix in a list with sector information
corr_df = corr_matrix.stack().reset_index()
corr_df.columns = ['Variable1', 'Variable2', 'Correlation']
corr_df['Sector'] = friendly_name # Add the sector name
# Append to the list
all_correlations.append(corr_df)
# Plot the heatmap
plt.figure(figsize=(15, 7))
sns.heatmap(corr_matrix, annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral")
plt.title(f'{friendly_name}: {rank}')
plt.show()
# Concatenate all correlation DataFrames into one
all_correlations_df = pd.concat(all_correlations, ignore_index=True)
# Filter for correlations greater than 0.75 (and not equal to 1)
high_corrs = all_correlations_df[(all_correlations_df['Correlation'] > 0.75) & (all_correlations_df['Variable1'] != all_correlations_df['Variable2'])]
ANSWER
In our initial correlation observations earlier, we visualized the correlation of the variables in both their native and scaled form. The scaled version normalizes the values of each feature and gives us a better picture of correlations.
Summary of prior results, for a correlation scale of 0 to 1:
Only a few of the variables show significant correlations (above .25 or below -.25).
For the scaled data, it's worth nothing that Net Income is present in both the top 2 positive and negative correlations. We can infer that Net Income has a major effect on other aspects of a stock ticker.
These series all show a positive correlation of greater than .25:
These series all show a negative correlation of less than -.25:
When we examine correlations by sector using our scaled data, we can observe some interesting information.
Cash ratio provides a measure of a company's ability to cover its short-term obligations using only cash and cash equivalents.
# Create pivot table for cash ratio by sector
df_eda_sector_cash_ratio_pivot = df_eda.pivot_table(
index='GICS Sector',
values='Cash Ratio',
aggfunc='mean'
)
# Sort the DataFrame by mean in descending order
df_eda_sector_cash_ratio_pivot.sort_values(by='Cash Ratio', ascending=False, inplace=True)
# Display the combined DataFrame
display(df_eda_sector_cash_ratio_pivot)
| Cash Ratio | |
|---|---|
| GICS Sector | |
| Information Technology | 149.818182 |
| Telecommunications Services | 117.000000 |
| Health Care | 103.775000 |
| Financials | 98.591837 |
| Consumer Staples | 70.947368 |
| Energy | 51.133333 |
| Real Estate | 50.111111 |
| Consumer Discretionary | 49.575000 |
| Materials | 41.700000 |
| Industrials | 36.188679 |
| Utilities | 13.625000 |
# Visualize cash ratio
plt.figure(figsize=(10, 5))
sns.barplot(data=df_eda.sort_values(by='Cash Ratio', ascending=False), x='GICS Sector', y='Cash Ratio', hue='GICS Sector')
plt.xticks(rotation=90)
plt.show()
ANSWER
As a whole, the average Cash Ratio is 70.02, with a median of 47.0. We can further examine by sector.
The top five sectors all have an average cash ratio above the average Cash Ratio:
The bottom three sectors for average cash ratio are:
The widest ranges of Cash Ratios exist in the Telecommunications Services and Information Technology sectors. The highest levels of Cash Ratios exist in these sectors as well, and companies within this sector have a strong liquidity position, meaning they can easily cover their short-term obligations.
Conversely, Utilities, Industrials and Real Estate have the tightest Cash Ratio ranges. Liquidity is low in these industries.
P/E ratios can help determine the relative value of a company's shares as they signify the amount of money an investor is willing to invest in a single share of a company per dollar of its earnings.
# Create pivot table for P/E Ratio by sector
df_eda_sector_pe_ratio_pivot = df_eda.pivot_table(
index='GICS Sector',
values='P/E Ratio',
aggfunc='mean'
)
# Sort the DataFrame by mean in descending order
df_eda_sector_pe_ratio_pivot.sort_values(by='P/E Ratio', ascending=False, inplace=True)
# Display the combined DataFrame
display(df_eda_sector_pe_ratio_pivot)
| P/E Ratio | |
|---|---|
| GICS Sector | |
| Energy | 72.897709 |
| Information Technology | 43.782546 |
| Real Estate | 43.065585 |
| Health Care | 41.135272 |
| Consumer Discretionary | 35.211613 |
| Consumer Staples | 25.521195 |
| Materials | 24.585352 |
| Utilities | 18.719412 |
| Industrials | 18.259380 |
| Financials | 16.023151 |
| Telecommunications Services | 12.222578 |
# Visualize P/E ratio
plt.figure(figsize=(10, 5))
sns.barplot(data=df_eda.sort_values(by='P/E Ratio', ascending=False), x='GICS Sector', y='P/E Ratio', hue='GICS Sector')
plt.xticks(rotation=90)
plt.show()
ANSWER
The average P/E Ratio across all sectors is 32.61, with a median of 20.82. Again, we can further examine by sector.
Five sectors all have an average P/E ratio above the average, and companies are holding onto cash longer. The market is likely bullish on these industries and expect future growth and/or profitablity.
The bottom three sectors for average P/E ratio are as follows. These sectors are likely not expected to grow much or generate much profit, and do not maintain large cash reserves.
The widest range of P/E Ratios exist in the Information Technology and Consumer Discretionary sectors.
The tightest range of P/E ratios exist in the Industrials, Financials, Utilities and Telecommunications Services sectors.
The goal of our data analysis is to group stocks into similar clusters based on the data attributes we have in our dataset, so that we can define the characteristics of each group.
To do this, we need to:
# Import additional libraries
from scipy.spatial.distance import cdist
# Library to compute distances
from scipy.spatial.distance import cdist, pdist
# Library to perform k-means clustering and compute silhouette scores
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
# Library for principal component analysis
from sklearn.decomposition import PCA
# Library for scaling; RobustScaler is supposed to work better for datasets with outliers
from sklearn.preprocessing import RobustScaler
# Libraries for visualization
from yellowbrick.cluster import KElbowVisualizer, SilhouetteVisualizer
from sklearn.manifold import TSNE
# Library to perform hierarchical clustering, compute cophenetic correlation, and create dendrograms
from sklearn.cluster import AgglomerativeClustering
from scipy.cluster.hierarchy import dendrogram, linkage, cophenet
# Copy the original dataframe before we begin
df_model = df.copy()
We know from earlier that we are not missing any values, but let's double check.
# Check data types
df_model.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 340 entries, 0 to 339 Data columns (total 15 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Ticker Symbol 340 non-null object 1 Security 340 non-null object 2 GICS Sector 340 non-null object 3 GICS Sub Industry 340 non-null object 4 Current Price 340 non-null float64 5 Price Change 340 non-null float64 6 Volatility 340 non-null float64 7 ROE 340 non-null int64 8 Cash Ratio 340 non-null int64 9 Net Cash Flow 340 non-null int64 10 Net Income 340 non-null int64 11 Earnings Per Share 340 non-null float64 12 Estimated Shares Outstanding 340 non-null float64 13 P/E Ratio 340 non-null float64 14 P/B Ratio 340 non-null float64 dtypes: float64(7), int64(4), object(4) memory usage: 40.0+ KB
# Describe the data. Can also use include=all to also show categorical series.
df_model.describe(include='all').T
| count | unique | top | freq | mean | std | min | 25% | 50% | 75% | max | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Ticker Symbol | 340 | 340 | AAL | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Security | 340 | 340 | American Airlines Group | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| GICS Sector | 340 | 11 | Industrials | 53 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| GICS Sub Industry | 340 | 104 | Oil & Gas Exploration & Production | 16 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Current Price | 340.0 | NaN | NaN | NaN | 80.862345 | 98.055086 | 4.5 | 38.555 | 59.705 | 92.880001 | 1274.949951 |
| Price Change | 340.0 | NaN | NaN | NaN | 4.078194 | 12.006338 | -47.129693 | -0.939484 | 4.819505 | 10.695493 | 55.051683 |
| Volatility | 340.0 | NaN | NaN | NaN | 1.525976 | 0.591798 | 0.733163 | 1.134878 | 1.385593 | 1.695549 | 4.580042 |
| ROE | 340.0 | NaN | NaN | NaN | 39.597059 | 96.547538 | 1.0 | 9.75 | 15.0 | 27.0 | 917.0 |
| Cash Ratio | 340.0 | NaN | NaN | NaN | 70.023529 | 90.421331 | 0.0 | 18.0 | 47.0 | 99.0 | 958.0 |
| Net Cash Flow | 340.0 | NaN | NaN | NaN | 55537620.588235 | 1946365312.175789 | -11208000000.0 | -193906500.0 | 2098000.0 | 169810750.0 | 20764000000.0 |
| Net Income | 340.0 | NaN | NaN | NaN | 1494384602.941176 | 3940150279.327937 | -23528000000.0 | 352301250.0 | 707336000.0 | 1899000000.0 | 24442000000.0 |
| Earnings Per Share | 340.0 | NaN | NaN | NaN | 2.776662 | 6.587779 | -61.2 | 1.5575 | 2.895 | 4.62 | 50.09 |
| Estimated Shares Outstanding | 340.0 | NaN | NaN | NaN | 577028337.75403 | 845849595.417695 | 27672156.86 | 158848216.1 | 309675137.8 | 573117457.325 | 6159292035.0 |
| P/E Ratio | 340.0 | NaN | NaN | NaN | 32.612563 | 44.348731 | 2.935451 | 15.044653 | 20.819876 | 31.764755 | 528.039074 |
| P/B Ratio | 340.0 | NaN | NaN | NaN | -1.718249 | 13.966912 | -76.119077 | -4.352056 | -1.06717 | 3.917066 | 129.064585 |
Our dataset is very small (340 rows of data). Below, we can observe quite a few outliers. However, eliminating these would remove a significant amount of data. In addition, the top outlier in each area isn't consistently a single security, so we know that the most extreme outliers are not all the same security.
For these reasons, we will not remove any outliers. However, since we can observe a high number of outliers, we should carefully consider which clustering algorithms we select.
# Check each feature to see if we have any single ticker/row as an outlier in all features
features = df_model.columns.difference(['Security'])
# Loop through each feature and print the Ticker Name with the maximum value
for feature in features:
max_value_row = df_model.loc[df_model[feature].idxmax()]
min_value_row = df_model.loc[df_model[feature].idxmin()]
print(f'**{feature}**')
print(f' Max: {max_value_row["Security"]} ({max_value_row[feature]})')
print(f' Min: {min_value_row["Security"]} ({min_value_row[feature]})')
**Cash Ratio** Max: Facebook (958) Min: Charter Communications (0) **Current Price** Max: Priceline.com Inc (1274.949951) Min: Chesapeake Energy (4.5) **Earnings Per Share** Max: Priceline.com Inc (50.09) Min: Apache Corporation (-61.2) **Estimated Shares Outstanding** Max: Pfizer Inc. (6159292035.0) Min: Mettler Toledo (27672156.86) **GICS Sector** Max: Ameren Corp (Utilities) Min: Amazon.com Inc (Consumer Discretionary) **GICS Sub Industry** Max: American Water Works Company Inc (Water Utilities) Min: Interpublic Group (Advertising) **Net Cash Flow** Max: Bank of America Corp (20764000000) Min: Citigroup Inc. (-11208000000) **Net Income** Max: JPMorgan Chase & Co. (24442000000) Min: Apache Corporation (-23528000000) **P/B Ratio** Max: Alliance Data Systems (129.0645854) Min: Charter Communications (-76.11907749) **P/E Ratio** Max: Amazon.com Inc (528.0390742) Min: United Continental Holdings (2.935450768) **Price Change** Max: First Solar Inc (55.05168339) Min: Kinder Morgan (-47.12969338) **ROE** Max: Apache Corporation (917) Min: Concho Resources (1) **Ticker Symbol** Max: Zoetis (ZTS) Min: American Airlines Group (AAL) **Volatility** Max: Southwestern Energy (4.58004173) Min: McDonald's Corp. (0.733163184)
# Create tight boxplots to show outliers
plt.figure(figsize=(12, 10))
num_cols = df_model.select_dtypes(include=np.number).columns.tolist()
for i, col in enumerate(num_cols):
plt.subplot(3, 4, i + 1)
plt.boxplot(df_model[col], whis=1.5)
plt.tight_layout()
plt.title(col)
plt.show()
df_model.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 340 entries, 0 to 339 Data columns (total 15 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Ticker Symbol 340 non-null object 1 Security 340 non-null object 2 GICS Sector 340 non-null object 3 GICS Sub Industry 340 non-null object 4 Current Price 340 non-null float64 5 Price Change 340 non-null float64 6 Volatility 340 non-null float64 7 ROE 340 non-null int64 8 Cash Ratio 340 non-null int64 9 Net Cash Flow 340 non-null int64 10 Net Income 340 non-null int64 11 Earnings Per Share 340 non-null float64 12 Estimated Shares Outstanding 340 non-null float64 13 P/E Ratio 340 non-null float64 14 P/B Ratio 340 non-null float64 dtypes: float64(7), int64(4), object(4) memory usage: 40.0+ KB
# One-hot encode variables for Sector and Sub Sector: Determined we don't want to do this
#df_model_clean = pd.get_dummies(df_model_clean, columns=['GICS Sector'])
#df_model_clean = pd.get_dummies(df_model_clean, columns=['GICS Sub Industry'])
# Drop unique values: Determined we don't want to do this
#df_model_clean = df_model_clean.drop(['Ticker Symbol', 'Security'], axis=1)
df_model.describe(include='all').T
| count | unique | top | freq | mean | std | min | 25% | 50% | 75% | max | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Ticker Symbol | 340 | 340 | AAL | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Security | 340 | 340 | American Airlines Group | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| GICS Sector | 340 | 11 | Industrials | 53 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| GICS Sub Industry | 340 | 104 | Oil & Gas Exploration & Production | 16 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Current Price | 340.0 | NaN | NaN | NaN | 80.862345 | 98.055086 | 4.5 | 38.555 | 59.705 | 92.880001 | 1274.949951 |
| Price Change | 340.0 | NaN | NaN | NaN | 4.078194 | 12.006338 | -47.129693 | -0.939484 | 4.819505 | 10.695493 | 55.051683 |
| Volatility | 340.0 | NaN | NaN | NaN | 1.525976 | 0.591798 | 0.733163 | 1.134878 | 1.385593 | 1.695549 | 4.580042 |
| ROE | 340.0 | NaN | NaN | NaN | 39.597059 | 96.547538 | 1.0 | 9.75 | 15.0 | 27.0 | 917.0 |
| Cash Ratio | 340.0 | NaN | NaN | NaN | 70.023529 | 90.421331 | 0.0 | 18.0 | 47.0 | 99.0 | 958.0 |
| Net Cash Flow | 340.0 | NaN | NaN | NaN | 55537620.588235 | 1946365312.175789 | -11208000000.0 | -193906500.0 | 2098000.0 | 169810750.0 | 20764000000.0 |
| Net Income | 340.0 | NaN | NaN | NaN | 1494384602.941176 | 3940150279.327937 | -23528000000.0 | 352301250.0 | 707336000.0 | 1899000000.0 | 24442000000.0 |
| Earnings Per Share | 340.0 | NaN | NaN | NaN | 2.776662 | 6.587779 | -61.2 | 1.5575 | 2.895 | 4.62 | 50.09 |
| Estimated Shares Outstanding | 340.0 | NaN | NaN | NaN | 577028337.75403 | 845849595.417695 | 27672156.86 | 158848216.1 | 309675137.8 | 573117457.325 | 6159292035.0 |
| P/E Ratio | 340.0 | NaN | NaN | NaN | 32.612563 | 44.348731 | 2.935451 | 15.044653 | 20.819876 | 31.764755 | 528.039074 |
| P/B Ratio | 340.0 | NaN | NaN | NaN | -1.718249 | 13.966912 | -76.119077 | -4.352056 | -1.06717 | 3.917066 | 129.064585 |
We observed earlier that we have a high outlier count, and also major discrepancies of scale in our features. When we examined correlations, we normalized the data. We will need to similar do this to our base model data.
We can choose between several different types of scaling - MinMaxScaler, Z-score, or Robust. Since Robust works well with datasets containing outliers, we will use it.
# Initialize the RobustScaler
scaler = RobustScaler()
subset = df_model.select_dtypes(include=['number']).copy()
# Scale the data
subset_scaled = scaler.fit_transform(subset)
# Convert to DataFrame
df_model_scaled = pd.DataFrame(subset_scaled, columns=subset.columns)
Next, let's see which clustering technique works best for our scaled dataset. To do this, we will first run PCA, and then use two clustering techniques.
First, we will attempt to visualize various k values to determine which one we want to use moving forward. To do this we will leverage the elbow method and visualize the distortion for each k value.
# Measure execution time
%%time
# Identify desired cluster count
df_k_means = df_model_scaled.copy()
clusters = range(1, 16)
meanDistortions = []
# Loop through cluster values
for k in clusters:
model_k_means = KMeans(n_clusters=k)
model_k_means.fit(df_k_means)
prediction = model_k_means.predict(df_k_means)
distortion = (
sum(
np.min(cdist(df_k_means, model_k_means.cluster_centers_, 'euclidean'), axis=1)
)
/ df_k_means.shape[0]
)
meanDistortions.append(distortion)
print('# of Clusters:', k, '\tAverage Distortion:', distortion)
plt.plot(clusters, meanDistortions, 'bx-')
plt.xlabel('k')
plt.ylabel('Avg Distortion')
plt.title('Selecting k with Elbow Method')
plt.show()
# of Clusters: 1 Average Distortion: 6.165015784179755 # of Clusters: 2 Average Distortion: 5.519913922135517 # of Clusters: 3 Average Distortion: 5.301175199731372 # of Clusters: 4 Average Distortion: 5.128821925696185 # of Clusters: 5 Average Distortion: 4.852367478608126 # of Clusters: 6 Average Distortion: 4.564025418031926 # of Clusters: 7 Average Distortion: 4.454884657269399 # of Clusters: 8 Average Distortion: 4.221774350762034 # of Clusters: 9 Average Distortion: 4.191296800846647 # of Clusters: 10 Average Distortion: 4.079244586736443 # of Clusters: 11 Average Distortion: 3.829791083252664 # of Clusters: 12 Average Distortion: 3.753949415523762 # of Clusters: 13 Average Distortion: 3.6369603635468146 # of Clusters: 14 Average Distortion: 3.596860202066002 # of Clusters: 15 Average Distortion: 3.445799561982341
CPU times: user 409 ms, sys: 112 ms, total: 521 ms Wall time: 488 ms
# Fit the model to the visualizer
model_k_means_2 = KMeans(random_state=1)
visualizer = KElbowVisualizer(model_k_means_2, k=(1, 16), timings=True)
visualizer.fit(df_k_means)
visualizer.show()
<Axes: title={'center': 'Distortion Score Elbow for KMeans Clustering'}, xlabel='k', ylabel='distortion score'>
# Check silhouette scores
sil_score = []
cluster_list = range(2, 16)
for n_clusters in cluster_list:
clusterer = KMeans(n_clusters=n_clusters, random_state=1)
preds = clusterer.fit_predict((df_k_means))
score = silhouette_score(df_k_means, preds)
sil_score.append(score)
print('n_clusters={}: Silhouette score={})'.format(n_clusters, score))
plt.plot(cluster_list, sil_score)
plt.show()
n_clusters=2: Silhouette score=0.7757324276239033) n_clusters=3: Silhouette score=0.7454764840420494) n_clusters=4: Silhouette score=0.4621955535259713) n_clusters=5: Silhouette score=0.4829025199901958) n_clusters=6: Silhouette score=0.4813819928027789) n_clusters=7: Silhouette score=0.49576783422468745) n_clusters=8: Silhouette score=0.411864134740954) n_clusters=9: Silhouette score=0.4012633404812339) n_clusters=10: Silhouette score=0.3763626634759577) n_clusters=11: Silhouette score=0.3505016857108297) n_clusters=12: Silhouette score=0.3613349297613386) n_clusters=13: Silhouette score=0.3806084017704855) n_clusters=14: Silhouette score=0.3183846743555661) n_clusters=15: Silhouette score=0.27805939414564607)
# Visualize the silhouette score for various values between 5 and 8
for i in range(5, 9):
visualizer = SilhouetteVisualizer(KMeans(i, random_state=1))
visualizer.fit(df_k_means)
visualizer.show()
# Run again for the ideal cluster count for k-means
kmeans = KMeans(n_clusters=5, random_state=1)
kmeans.fit(df_model_scaled)
KMeans(n_clusters=5, random_state=1)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
KMeans(n_clusters=5, random_state=1)
# Apply the k-means to a final model
df_kmean = df_model.copy()
# Add labels
df_kmean['Cluster'] = kmeans.labels_
df_model_scaled['Cluster'] = kmeans.labels_
# Show counts by cluster
display(df_kmean['Cluster'].value_counts())
| count | |
|---|---|
| Cluster | |
| 3 | 303 |
| 1 | 16 |
| 0 | 12 |
| 2 | 7 |
| 4 | 2 |
# Show the securities in each cluster
for cluster in df_kmean['Cluster'].unique():
cluster_securities = df_kmean[df_kmean['Cluster'] == cluster]['Security']
print(f'Cluster {cluster}: {", ".join(cluster_securities)}')
Cluster 1: American Airlines Group, AbbVie, Devon Energy Corp., Freeport-McMoran Cp & Gld, IDEXX Laboratories, Masco Corp., Moody's Corp, Altria Group Inc, Newfield Exploration Co, Pitney-Bowes, Sherwin-Williams, Southwestern Energy, United Parcel Service, Wynn Resorts Ltd, Cimarex Energy, Yum! Brands Inc Cluster 3: Abbott Laboratories, Adobe Systems Inc, Analog Devices, Inc., Archer-Daniels-Midland Co, Alliance Data Systems, Ameren Corp, American Electric Power, AFLAC Inc, American International Group, Inc., Apartment Investment & Mgmt, Assurant Inc, Arthur J. Gallagher & Co., Akamai Technologies Inc, Albemarle Corp, Alaska Air Group Inc, Allstate Corp, Alexion Pharmaceuticals, Applied Materials Inc, AMETEK Inc, Affiliated Managers Group Inc, Amgen Inc, Ameriprise Financial, American Tower Corp A, Amazon.com Inc, AutoNation Inc, Anthem Inc., Aon plc, Amphenol Corp, Arconic Inc, Activision Blizzard, AvalonBay Communities, Inc., Broadcom, American Water Works Company Inc, American Express Co, Boeing Company, Baxter International Inc., BB&T Corporation, Bard (C.R.) Inc., Baker Hughes Inc, BIOGEN IDEC Inc., The Bank of New York Mellon Corp., Ball Corp, Boston Scientific, BorgWarner, Boston Properties, Caterpillar Inc., Chubb Limited, CBRE Group, Crown Castle International Corp., Carnival Corp., Celgene Corp., CF Industries Holdings Inc, Citizens Financial Group, Church & Dwight, C. H. Robinson Worldwide, CIGNA Corp., Cincinnati Financial, Comerica Inc., CME Group Inc., Chipotle Mexican Grill, Cummins Inc., CMS Energy, Centene Corporation, CenterPoint Energy, Capital One Financial, Cabot Oil & Gas, The Cooper Companies, CSX Corp., CenturyLink Inc, Cognizant Technology Solutions, Citrix Systems, CVS Health, Chevron Corp., Concho Resources, Dominion Resources, Delta Air Lines, Du Pont (E.I.), Deere & Co., Discover Financial Services, Quest Diagnostics, Danaher Corp., The Walt Disney Company, Discovery Communications-A, Discovery Communications-C, Delphi Automotive, Digital Realty Trust, Dun & Bradstreet, Dover Corp., Dr Pepper Snapple Group, Duke Energy, DaVita Inc., Ecolab Inc., Consolidated Edison, Equifax Inc., Edison Int'l, Eastman Chemical, EOG Resources, Equinix, Equity Residential, EQT Corporation, Eversource Energy, Essex Property Trust, Inc., E*Trade, Eaton Corporation, Entergy Corp., Edwards Lifesciences, Exelon Corp., Expeditors Int'l, Expedia Inc., Extra Space Storage, Ford Motor, Fastenal Co, Facebook, Fortune Brands Home & Security, FirstEnergy Corp, Fidelity National Information Services, Fiserv Inc, FLIR Systems, Fluor Corp., Flowserve Corporation, FMC Corporation, Federal Realty Investment Trust, First Solar Inc, Frontier Communications, General Dynamics, General Growth Properties Inc., Gilead Sciences, Corning Inc., Genuine Parts, Garmin Ltd., Goodyear Tire & Rubber, Grainger (W.W.) Inc., Halliburton Co., Hasbro Inc., Huntington Bancshares, HCA Holdings, Welltower Inc., HCP Inc., Hess Corporation, Hartford Financial Svc.Gp., Harley-Davidson, Honeywell Int'l Inc., Hewlett Packard Enterprise, HP Inc., Hormel Foods Corp., Henry Schein, Host Hotels & Resorts, The Hershey Company, Humana Inc., International Business Machines, Intl Flavors & Fragrances, International Paper, Interpublic Group, Iron Mountain Incorporated, Intuitive Surgical Inc., Illinois Tool Works, Invesco Ltd., J. B. Hunt Transport Services, Jacobs Engineering Group, Juniper Networks, Kimco Realty, Kinder Morgan, Kansas City Southern, Leggett & Platt, Lennar Corp., Laboratory Corp. of America Holding, LKQ Corporation, L-3 Communications Holdings, Lilly (Eli) & Co., Lockheed Martin Corp., Alliant Energy Corp, Leucadia National Corp., Southwest Airlines, Level 3 Communications, LyondellBasell, Mastercard Inc., Mid-America Apartments, Macerich, Marriott Int'l., Mattel Inc., McDonald's Corp., Mondelez International, MetLife Inc., Mohawk Industries, Mead Johnson, McCormick & Co., Martin Marietta Materials, Marsh & McLennan, 3M Company, Monster Beverage, The Mosaic Company, Marathon Petroleum, Merck & Co., Marathon Oil Corp., M&T Bank Corp., Mettler Toledo, Murphy Oil, Mylan N.V., Navient, Noble Energy Inc, NASDAQ OMX Group, NextEra Energy, Newmont Mining Corp. (Hldg. Co.), Netflix Inc., Nielsen Holdings, National Oilwell Varco Inc., Norfolk Southern Corp., Northern Trust Corp., Nucor Corp., Newell Brands, Realty Income Corporation, ONEOK, Omnicom Group, O'Reilly Automotive, Occidental Petroleum, People's United Financial, PACCAR Inc., PG&E Corp., Priceline.com Inc, Public Serv. Enterprise Inc., PepsiCo Inc., Principal Financial Group, Procter & Gamble, Progressive Corp., Pulte Homes Inc., Philip Morris International, PNC Financial Services, Pentair Ltd., Pinnacle West Capital, PPG Industries, PPL Corp., Prudential Financial, Phillips 66, Quanta Services Inc., Praxair Inc., PayPal, Ryder System, Royal Caribbean Cruises Ltd, Regeneron, Robert Half International, Roper Industries, Range Resources Corp., Republic Services Inc, SCANA Corp, Charles Schwab Corporation, Spectra Energy Corp., Sealed Air, SL Green Realty, Scripps Networks Interactive Inc., Southern Co., Simon Property Group Inc, Stericycle Inc, Sempra Energy, SunTrust Banks, State Street Corp., Skyworks Solutions, Synchrony Financial, Stryker Corp., Molson Coors Brewing Company, Teradata Corp., Tegna, Inc., Torchmark Corp., Thermo Fisher Scientific, TripAdvisor, The Travelers Companies Inc., Tractor Supply Company, Tyson Foods, Tesoro Petroleum Co., Total System Services, Texas Instruments, Under Armour, United Continental Holdings, UDR Inc, Universal Health Services, Inc., United Health Group Inc., Unum Group, Union Pacific, United Technologies, Varian Medical Systems, Valero Energy, Vulcan Materials, Vornado Realty Trust, Verisk Analytics, Verisign Inc., Vertex Pharmaceuticals Inc, Ventas Inc, Waters Corporation, Wec Energy Group Inc, Whirlpool Corp., Waste Management Inc., Williams Cos., Western Union Co, Weyerhaeuser Corp., Wyndham Worldwide, Xcel Energy Inc, XL Capital, Dentsply Sirona, Xerox Corp., Xylem Inc., Yahoo Inc., Zimmer Biomet Holdings, Zions Bancorp, Zoetis Cluster 2: Allegion, Apache Corporation, Chesapeake Energy, Charter Communications, Colgate-Palmolive, Kimberly-Clark, S&P Global, Inc. Cluster 0: Anadarko Petroleum Corp, Bristol-Myers Squibb, Citigroup Inc., eBay Inc., General Motors, JPMorgan Chase & Co., Coca Cola Company, Pfizer Inc., AT&T Inc, Verizon Communications, Wells Fargo, Exxon Mobil Corp. Cluster 4: Bank of America Corp, Intel Corp.
# Create a crosstab for sectors and clusters
df_kmean_crosstab_with_totals = pd.crosstab(df_kmean['Cluster'], df_kmean['GICS Sector'], margins=True, margins_name="Total")
# Display the crosstab with totals
df_kmean_crosstab_with_totals.T # Transpose to switch rows/columns if needed
| Cluster | 0 | 1 | 2 | 3 | 4 | Total |
|---|---|---|---|---|---|---|
| GICS Sector | ||||||
| Consumer Discretionary | 1 | 2 | 1 | 36 | 0 | 40 |
| Consumer Staples | 1 | 1 | 2 | 15 | 0 | 19 |
| Energy | 2 | 4 | 2 | 22 | 0 | 30 |
| Financials | 3 | 1 | 1 | 43 | 1 | 49 |
| Health Care | 2 | 2 | 0 | 36 | 0 | 40 |
| Industrials | 0 | 4 | 1 | 48 | 0 | 53 |
| Information Technology | 1 | 0 | 0 | 31 | 1 | 33 |
| Materials | 0 | 2 | 0 | 18 | 0 | 20 |
| Real Estate | 0 | 0 | 0 | 27 | 0 | 27 |
| Telecommunications Services | 2 | 0 | 0 | 3 | 0 | 5 |
| Utilities | 0 | 0 | 0 | 24 | 0 | 24 |
| Total | 12 | 16 | 7 | 303 | 2 | 340 |
# Show means for our original columns by Cluster
#df_model_k_means_grouped = df_model_k_means.groupby('Cluster').mean()
# Loop through the selected columns and plot boxplots
for col in list_numcol:
if col in df_kmean.columns: # Ensure the column exists in the DataFrame
plt.figure(figsize=(8, 5))
sns.boxplot(data=df_kmean, x='Cluster', y=col, hue='Cluster', palette='viridis', legend='full')
plt.xticks(rotation=90)
plt.title(f'Boxplot of {col} by Cluster')
plt.show()
OBSERVATION
It appears that a k value of 5 or 6 would work best for our dataset, as it has the highest silhouette score, and also emerged as a key value in the elbow exercise.
After using K-means to predict the clusters on our dataset, we can see the companies break out as follows into our five clusters:
Cluster 0 (12 Companies)
Anadarko Petroleum Corp, Bristol-Myers Squibb, Citigroup Inc., eBay Inc., General Motors, JPMorgan Chase & Co., Coca Cola Company, Pfizer Inc., AT&T Inc, Verizon Communications, Wells Fargo, Exxon Mobil Corp.
Cluster 1 (16 Companies)
American Airlines Group, AbbVie, Devon Energy Corp., Freeport-McMoran Cp & Gld, IDEXX Laboratories, Masco Corp., Moody's Corp, Altria Group Inc, Newfield Exploration Co, Pitney-Bowes, Sherwin-Williams, Southwestern Energy, United Parcel Service, Wynn Resorts Ltd, Cimarex Energy, Yum! Brands Inc
Cluster 2 (7 Companies)
Allegion, Apache Corporation, Chesapeake Energy, Charter Communications, Colgate-Palmolive, Kimberly-Clark, S&P Global, Inc.
Cluster 4 (2 Companies)
Bank of America Corp, Intel Corp.
Cluster 3 (All remaining 303 Companies)
Abbott Laboratories, Adobe Systems Inc, Analog Devices, Inc., Archer-Daniels-Midland Co, Alliance Data Systems, Ameren Corp, American Electric Power, AFLAC Inc, American International Group, Inc., Apartment Investment & Mgmt, Assurant Inc, Arthur J. Gallagher & Co., Akamai Technologies Inc, Albemarle Corp, Alaska Air Group Inc, Allstate Corp, Alexion Pharmaceuticals, Applied Materials Inc, AMETEK Inc, Affiliated Managers Group Inc, Amgen Inc, Ameriprise Financial, American Tower Corp A, Amazon.com Inc, AutoNation Inc, Anthem Inc., Aon plc, Amphenol Corp, Arconic Inc, Activision Blizzard, AvalonBay Communities, Inc., Broadcom, American Water Works Company Inc, American Express Co, Boeing Company, Baxter International Inc., BB&T Corporation, Bard (C.R.) Inc., Baker Hughes Inc, BIOGEN IDEC Inc., The Bank of New York Mellon Corp., Ball Corp, Boston Scientific, BorgWarner, Boston Properties, Caterpillar Inc., Chubb Limited, CBRE Group, Crown Castle International Corp., Carnival Corp., Celgene Corp., CF Industries Holdings Inc, Citizens Financial Group, Church & Dwight, C. H. Robinson Worldwide, CIGNA Corp., Cincinnati Financial, Comerica Inc., CME Group Inc., Chipotle Mexican Grill, Cummins Inc., CMS Energy, Centene Corporation, CenterPoint Energy, Capital One Financial, Cabot Oil & Gas, The Cooper Companies, CSX Corp., CenturyLink Inc, Cognizant Technology Solutions, Citrix Systems, CVS Health, Chevron Corp., Concho Resources, Dominion Resources, Delta Air Lines, Du Pont (E.I.), Deere & Co., Discover Financial Services, Quest Diagnostics, Danaher Corp., The Walt Disney Company, Discovery Communications-A, Discovery Communications-C, Delphi Automotive, Digital Realty Trust, Dun & Bradstreet, Dover Corp., Dr Pepper Snapple Group, Duke Energy, DaVita Inc., Ecolab Inc., Consolidated Edison, Equifax Inc., Edison Int'l, Eastman Chemical, EOG Resources, Equinix, Equity Residential, EQT Corporation, Eversource Energy, Essex Property Trust, Inc., E*Trade, Eaton Corporation, Entergy Corp., Edwards Lifesciences, Exelon Corp., Expeditors Int'l, Expedia Inc., Extra Space Storage, Ford Motor, Fastenal Co, Facebook, Fortune Brands Home & Security, FirstEnergy Corp, Fidelity National Information Services, Fiserv Inc, FLIR Systems, Fluor Corp., Flowserve Corporation, FMC Corporation, Federal Realty Investment Trust, First Solar Inc, Frontier Communications, General Dynamics, General Growth Properties Inc., Gilead Sciences, Corning Inc., Genuine Parts, Garmin Ltd., Goodyear Tire & Rubber, Grainger (W.W.) Inc., Halliburton Co., Hasbro Inc., Huntington Bancshares, HCA Holdings, Welltower Inc., HCP Inc., Hess Corporation, Hartford Financial Svc.Gp., Harley-Davidson, Honeywell Int'l Inc., Hewlett Packard Enterprise, HP Inc., Hormel Foods Corp., Henry Schein, Host Hotels & Resorts, The Hershey Company, Humana Inc., International Business Machines, Intl Flavors & Fragrances, International Paper, Interpublic Group, Iron Mountain Incorporated, Intuitive Surgical Inc., Illinois Tool Works, Invesco Ltd., J. B. Hunt Transport Services, Jacobs Engineering Group, Juniper Networks, Kimco Realty, Kinder Morgan, Kansas City Southern, Leggett & Platt, Lennar Corp., Laboratory Corp. of America Holding, LKQ Corporation, L-3 Communications Holdings, Lilly (Eli) & Co., Lockheed Martin Corp., Alliant Energy Corp, Leucadia National Corp., Southwest Airlines, Level 3 Communications, LyondellBasell, Mastercard Inc., Mid-America Apartments, Macerich, Marriott Int'l., Mattel Inc., McDonald's Corp., Mondelez International, MetLife Inc., Mohawk Industries, Mead Johnson, McCormick & Co., Martin Marietta Materials, Marsh & McLennan, 3M Company, Monster Beverage, The Mosaic Company, Marathon Petroleum, Merck & Co., Marathon Oil Corp., M&T Bank Corp., Mettler Toledo, Murphy Oil, Mylan N.V., Navient, Noble Energy Inc, NASDAQ OMX Group, NextEra Energy, Newmont Mining Corp. (Hldg. Co.), Netflix Inc., Nielsen Holdings, National Oilwell Varco Inc., Norfolk Southern Corp., Northern Trust Corp., Nucor Corp., Newell Brands, Realty Income Corporation, ONEOK, Omnicom Group, O'Reilly Automotive, Occidental Petroleum, People's United Financial, PACCAR Inc., PG&E Corp., Priceline.com Inc, Public Serv. Enterprise Inc., PepsiCo Inc., Principal Financial Group, Procter & Gamble, Progressive Corp., Pulte Homes Inc., Philip Morris International, PNC Financial Services, Pentair Ltd., Pinnacle West Capital, PPG Industries, PPL Corp., Prudential Financial, Phillips 66, Quanta Services Inc., Praxair Inc., PayPal, Ryder System, Royal Caribbean Cruises Ltd, Regeneron, Robert Half International, Roper Industries, Range Resources Corp., Republic Services Inc, SCANA Corp, Charles Schwab Corporation, Spectra Energy Corp., Sealed Air, SL Green Realty, Scripps Networks Interactive Inc., Southern Co., Simon Property Group Inc, Stericycle Inc, Sempra Energy, SunTrust Banks, State Street Corp., Skyworks Solutions, Synchrony Financial, Stryker Corp., Molson Coors Brewing Company, Teradata Corp., Tegna, Inc., Torchmark Corp., Thermo Fisher Scientific, TripAdvisor, The Travelers Companies Inc., Tractor Supply Company, Tyson Foods, Tesoro Petroleum Co., Total System Services, Texas Instruments, Under Armour, United Continental Holdings, UDR Inc, Universal Health Services, Inc., United Health Group Inc., Unum Group, Union Pacific, United Technologies, Varian Medical Systems, Valero Energy, Vulcan Materials, Vornado Realty Trust, Verisk Analytics, Verisign Inc., Vertex Pharmaceuticals Inc, Ventas Inc, Waters Corporation, Wec Energy Group Inc, Whirlpool Corp., Waste Management Inc., Williams Cos., Western Union Co, Weyerhaeuser Corp., Wyndham Worldwide, Xcel Energy Inc, XL Capital, Dentsply Sirona, Xerox Corp., Xylem Inc., Yahoo Inc., Zimmer Biomet Holdings, Zions Bancorp, Zoetis
Next, we will try hierarchical clusterings.
The first thing we will do is figure out which distance metric and linkage type to use.
df_hierarchical = df_model_scaled.copy()
#Measure execution time
%%time
# Create list of distance metrics
distance_metrics = ['euclidean', 'chebyshev', 'mahalanobis', 'cityblock']
# Create list of linkage methods
linkage_methods = ['single', 'complete', 'average', 'centroid', 'ward', 'weighted']
high_cophenet_corr = 0
high_dm_lm = [0, 0]
# Calculate the cophentics for each combination of distance type and linkage method
for dm in distance_metrics:
for lm in linkage_methods:
try:
# Try to compute linkage and cophenet correlation
Z = linkage(df_hierarchical, metric=dm, method=lm)
c, coph_dists = cophenet(Z, pdist(df_hierarchical))
# Print cophenetic correlation result
print(
'Cophenetic correlation for {} distance and {} linkage is {}.'.format(
dm.capitalize(), lm, c
)
)
# Update the highest cophenetic correlation
if high_cophenet_corr < c:
high_cophenet_corr = c
high_dm_lm[0] = dm
high_dm_lm[1] = lm
except Exception as e:
# Not all distance types can be used with all linkage types, so we need to gracefully handle errors
print(f'Error for {dm.capitalize()} distance and {lm} linkage: {e}')
Cophenetic correlation for Euclidean distance and single linkage is 0.9515911691498813. Cophenetic correlation for Euclidean distance and complete linkage is 0.8973300286572553. Cophenetic correlation for Euclidean distance and average linkage is 0.9456134806319734. Cophenetic correlation for Euclidean distance and centroid linkage is 0.9468040178955592. Cophenetic correlation for Euclidean distance and ward linkage is 0.8184834889632843. Cophenetic correlation for Euclidean distance and weighted linkage is 0.8824190968054058. Cophenetic correlation for Chebyshev distance and single linkage is 0.944359706184292. Cophenetic correlation for Chebyshev distance and complete linkage is 0.8863927240886589. Cophenetic correlation for Chebyshev distance and average linkage is 0.9381981458216698. Error for Chebyshev distance and centroid linkage: `method=centroid` requires the distance metric to be Euclidean Error for Chebyshev distance and ward linkage: `method=ward` requires the distance metric to be Euclidean Cophenetic correlation for Chebyshev distance and weighted linkage is 0.8839528285551623. Cophenetic correlation for Mahalanobis distance and single linkage is 0.8688554364012083. Cophenetic correlation for Mahalanobis distance and complete linkage is 0.49533463354436585. Cophenetic correlation for Mahalanobis distance and average linkage is 0.8380995772123909. Error for Mahalanobis distance and centroid linkage: `method=centroid` requires the distance metric to be Euclidean Error for Mahalanobis distance and ward linkage: `method=ward` requires the distance metric to be Euclidean Cophenetic correlation for Mahalanobis distance and weighted linkage is 0.7167075784099872. Cophenetic correlation for Cityblock distance and single linkage is 0.9209089428269329. Cophenetic correlation for Cityblock distance and complete linkage is 0.8440926567175726. Cophenetic correlation for Cityblock distance and average linkage is 0.9363186631034428. Error for Cityblock distance and centroid linkage: `method=centroid` requires the distance metric to be Euclidean Error for Cityblock distance and ward linkage: `method=ward` requires the distance metric to be Euclidean Cophenetic correlation for Cityblock distance and weighted linkage is 0.8593697074578455. CPU times: user 197 ms, sys: 102 ms, total: 299 ms Wall time: 185 ms
# Print the ideal combination
print(
'Highest cophenetic correlation is {}, which is obtained with {} distance and {} linkage.'.format(
high_cophenet_corr, high_dm_lm[0].capitalize(), high_dm_lm[1]
)
)
Highest cophenetic correlation is 0.9515911691498813, which is obtained with Euclidean distance and single linkage.
The above code tells us that the best combination is Euclidean distance with single linkage. However, we need to visual the dendograms to see if this recommendation is suitable or not.
Let's re-run linkages, but using only Euclidian distances this time.
# View dendograms
# Create list of linkage methods
linkage_methods = ['single', 'complete', 'average', 'centroid', 'ward', 'weighted'] ## Complete the code to add linkages
# Create lists to save results
compare_cols = ['Linkage', 'Cophenetic Coefficient']
# Create subplot
fig, axs = plt.subplots(len(linkage_methods), 1, figsize=(15, 30))
# Loop through thelinkage methods and plot dendrogram and calculate cophenetic correlation
for i, method in enumerate(linkage_methods):
Z = linkage(df_hierarchical, metric='euclidean', method=method)
dendrogram(Z, ax=axs[i])
axs[i].set_title(f'Dendrogram ({method.capitalize()} Linkage)')
coph_corr, coph_dist = cophenet(Z, pdist(df_hierarchical))
axs[i].annotate(
f'Cophenetic\nCorrelation\n{coph_corr:0.2f}',
(0.80, 0.80),
xycoords='axes fraction',
)
Next, we need to fit a new model using our chosen distance and linkage approach.
Single, centroid and average have the highest cophenetic correlation values, but also exhibit overlaps:
Single (Cophenetic Correlation = .95)
Several of the dendograms have lower scores but appear well organized.
Weighted (Cophenetic Correlation = .88)
We will choose complete, as the dendogram is very clean, and the value is not much lower than the single linkage (.95). In addition, we'll choose a cluster count of 5 again.
# Fit the model with the desired number of clusters and selected distance and linkage type
# Note that affinity is now called metric in the function below
df_model_hc_5 = AgglomerativeClustering(n_clusters=5, metric='euclidean', linkage='complete')
df_model_hc_5.fit(df_hierarchical)
AgglomerativeClustering(linkage='complete', n_clusters=5)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
AgglomerativeClustering(linkage='complete', n_clusters=5)
# Apply the hierarchical clustering to a final model
df_hc = df_model.copy()
# Add labels
df_hierarchical['Cluster'] = df_model_hc_5.labels_
df_hc['Cluster'] = df_model_hc_5.labels_
# Group and count the results
display(df_hc['Cluster'].value_counts())
| count | |
|---|---|
| Cluster | |
| 1 | 318 |
| 0 | 10 |
| 2 | 7 |
| 3 | 3 |
| 4 | 2 |
# Show the securities in each cluster
for cluster in df_hc['Cluster'].unique():
cluster_securities = df_hc[df_hc['Cluster'] == cluster]['Security']
print(f'Cluster {cluster}: {", ".join(cluster_securities)}')
Cluster 1: American Airlines Group, AbbVie, Abbott Laboratories, Adobe Systems Inc, Analog Devices, Inc., Archer-Daniels-Midland Co, Alliance Data Systems, Ameren Corp, American Electric Power, AFLAC Inc, American International Group, Inc., Apartment Investment & Mgmt, Assurant Inc, Arthur J. Gallagher & Co., Akamai Technologies Inc, Albemarle Corp, Alaska Air Group Inc, Allstate Corp, Applied Materials Inc, AMETEK Inc, Affiliated Managers Group Inc, Amgen Inc, Ameriprise Financial, American Tower Corp A, AutoNation Inc, Anthem Inc., Aon plc, Anadarko Petroleum Corp, Amphenol Corp, Arconic Inc, Activision Blizzard, AvalonBay Communities, Inc., Broadcom, American Water Works Company Inc, American Express Co, Boeing Company, Baxter International Inc., BB&T Corporation, Bard (C.R.) Inc., Baker Hughes Inc, BIOGEN IDEC Inc., The Bank of New York Mellon Corp., Ball Corp, Bristol-Myers Squibb, Boston Scientific, BorgWarner, Boston Properties, Caterpillar Inc., Chubb Limited, CBRE Group, Crown Castle International Corp., Carnival Corp., Celgene Corp., CF Industries Holdings Inc, Citizens Financial Group, Church & Dwight, C. H. Robinson Worldwide, CIGNA Corp., Cincinnati Financial, Comerica Inc., CME Group Inc., Chipotle Mexican Grill, Cummins Inc., CMS Energy, Centene Corporation, CenterPoint Energy, Capital One Financial, Cabot Oil & Gas, The Cooper Companies, CSX Corp., CenturyLink Inc, Cognizant Technology Solutions, Citrix Systems, CVS Health, Chevron Corp., Concho Resources, Dominion Resources, Delta Air Lines, Du Pont (E.I.), Deere & Co., Discover Financial Services, Quest Diagnostics, Danaher Corp., The Walt Disney Company, Discovery Communications-A, Discovery Communications-C, Delphi Automotive, Digital Realty Trust, Dun & Bradstreet, Dover Corp., Dr Pepper Snapple Group, Duke Energy, DaVita Inc., Devon Energy Corp., eBay Inc., Ecolab Inc., Consolidated Edison, Equifax Inc., Edison Int'l, Eastman Chemical, EOG Resources, Equinix, Equity Residential, EQT Corporation, Eversource Energy, Essex Property Trust, Inc., E*Trade, Eaton Corporation, Entergy Corp., Edwards Lifesciences, Expeditors Int'l, Expedia Inc., Extra Space Storage, Fastenal Co, Facebook, Fortune Brands Home & Security, Freeport-McMoran Cp & Gld, FirstEnergy Corp, Fidelity National Information Services, Fiserv Inc, FLIR Systems, Fluor Corp., Flowserve Corporation, FMC Corporation, Federal Realty Investment Trust, First Solar Inc, Frontier Communications, General Dynamics, General Growth Properties Inc., Gilead Sciences, Corning Inc., General Motors, Genuine Parts, Garmin Ltd., Goodyear Tire & Rubber, Grainger (W.W.) Inc., Hasbro Inc., Huntington Bancshares, HCA Holdings, Welltower Inc., HCP Inc., Hess Corporation, Hartford Financial Svc.Gp., Harley-Davidson, Honeywell Int'l Inc., HP Inc., Hormel Foods Corp., Henry Schein, Host Hotels & Resorts, The Hershey Company, Humana Inc., International Business Machines, IDEXX Laboratories, Intl Flavors & Fragrances, International Paper, Interpublic Group, Iron Mountain Incorporated, Intuitive Surgical Inc., Illinois Tool Works, Invesco Ltd., J. B. Hunt Transport Services, Jacobs Engineering Group, Juniper Networks, Kimco Realty, Kinder Morgan, Coca Cola Company, Kansas City Southern, Leggett & Platt, Lennar Corp., Laboratory Corp. of America Holding, LKQ Corporation, L-3 Communications Holdings, Lilly (Eli) & Co., Lockheed Martin Corp., Alliant Energy Corp, Leucadia National Corp., Southwest Airlines, Level 3 Communications, LyondellBasell, Mastercard Inc., Mid-America Apartments, Macerich, Marriott Int'l., Masco Corp., Mattel Inc., Moody's Corp, Mondelez International, MetLife Inc., Mohawk Industries, Mead Johnson, McCormick & Co., Martin Marietta Materials, Marsh & McLennan, 3M Company, Monster Beverage, Altria Group Inc, The Mosaic Company, Marathon Petroleum, Merck & Co., Marathon Oil Corp., M&T Bank Corp., Mettler Toledo, Murphy Oil, Mylan N.V., Navient, Noble Energy Inc, NASDAQ OMX Group, NextEra Energy, Newmont Mining Corp. (Hldg. Co.), Newfield Exploration Co, Nielsen Holdings, National Oilwell Varco Inc., Norfolk Southern Corp., Nucor Corp., Newell Brands, Realty Income Corporation, ONEOK, Omnicom Group, O'Reilly Automotive, Occidental Petroleum, People's United Financial, Pitney-Bowes, PACCAR Inc., PG&E Corp., Public Serv. Enterprise Inc., PepsiCo Inc., Pfizer Inc., Principal Financial Group, Procter & Gamble, Progressive Corp., Pulte Homes Inc., Philip Morris International, PNC Financial Services, Pentair Ltd., Pinnacle West Capital, PPG Industries, PPL Corp., Prudential Financial, Phillips 66, Quanta Services Inc., Praxair Inc., PayPal, Ryder System, Royal Caribbean Cruises Ltd, Regeneron, Robert Half International, Roper Industries, Range Resources Corp., Republic Services Inc, SCANA Corp, Charles Schwab Corporation, Spectra Energy Corp., Sealed Air, Sherwin-Williams, SL Green Realty, Scripps Networks Interactive Inc., Southern Co., Simon Property Group Inc, Stericycle Inc, Sempra Energy, SunTrust Banks, State Street Corp., Skyworks Solutions, Southwestern Energy, Synchrony Financial, Stryker Corp., AT&T Inc, Molson Coors Brewing Company, Teradata Corp., Tegna, Inc., Torchmark Corp., Thermo Fisher Scientific, TripAdvisor, The Travelers Companies Inc., Tractor Supply Company, Tyson Foods, Tesoro Petroleum Co., Total System Services, Texas Instruments, Under Armour, United Continental Holdings, UDR Inc, Universal Health Services, Inc., United Health Group Inc., Unum Group, Union Pacific, United Parcel Service, United Technologies, Varian Medical Systems, Valero Energy, Vulcan Materials, Vornado Realty Trust, Verisk Analytics, Verisign Inc., Vertex Pharmaceuticals Inc, Ventas Inc, Waters Corporation, Wec Energy Group Inc, Wells Fargo, Whirlpool Corp., Waste Management Inc., Williams Cos., Western Union Co, Weyerhaeuser Corp., Wyndham Worldwide, Wynn Resorts Ltd, Cimarex Energy, Xcel Energy Inc, XL Capital, Exxon Mobil Corp., Dentsply Sirona, Xerox Corp., Xylem Inc., Yahoo Inc., Yum! Brands Inc, Zimmer Biomet Holdings, Zions Bancorp, Zoetis Cluster 2: Allegion, Apache Corporation, Chesapeake Energy, Charter Communications, Colgate-Palmolive, Kimberly-Clark, S&P Global, Inc. Cluster 0: Alexion Pharmaceuticals, Amazon.com Inc, Exelon Corp., Ford Motor, Halliburton Co., Hewlett Packard Enterprise, McDonald's Corp., Netflix Inc., Northern Trust Corp., Priceline.com Inc Cluster 4: Bank of America Corp, Intel Corp. Cluster 3: Citigroup Inc., JPMorgan Chase & Co., Verizon Communications
# Loop through the selected columns and plot boxplots
for col in list_numcol:
if col in df_hc.columns: # Ensure the column exists in the DataFrame
plt.figure(figsize=(8, 5))
sns.boxplot(data=df_hc, x='Cluster', y=col, hue='Cluster', palette='viridis', legend='full')
plt.xticks(rotation=90)
plt.title(f"Boxplot of {col} by Cluster")
plt.show()
OBSERVATIONS
Using five clusters in hierarchical clustering assigned companies as follows:
Cluster 0 (10 companies)
Alexion Pharmaceuticals, Amazon.com Inc, Exelon Corp., Ford Motor, Halliburton Co., Hewlett Packard Enterprise, McDonald's Corp., Netflix Inc., Northern Trust Corp., Priceline.com Inc
Cluster 2 (7 companies)
Allegion, Apache Corporation, Chesapeake Energy, Charter Communications, Colgate-Palmolive, Kimberly-Clark, S&P Global, Inc.
Cluster 3 (3 companies)
Citigroup Inc., JPMorgan Chase & Co., Verizon Communications
Cluster 4 (2 companies)
Bank of America Corp, Intel Corp.
Cluster 1 (All other 318 companies)
American Airlines Group, AbbVie, Abbott Laboratories, Adobe Systems Inc, Analog Devices, Inc., Archer-Daniels-Midland Co, Alliance Data Systems, Ameren Corp, American Electric Power, AFLAC Inc, American International Group, Inc., Apartment Investment & Mgmt, Assurant Inc, Arthur J. Gallagher & Co., Akamai Technologies Inc, Albemarle Corp, Alaska Air Group Inc, Allstate Corp, Applied Materials Inc, AMETEK Inc, Affiliated Managers Group Inc, Amgen Inc, Ameriprise Financial, American Tower Corp A, AutoNation Inc, Anthem Inc., Aon plc, Anadarko Petroleum Corp, Amphenol Corp, Arconic Inc, Activision Blizzard, AvalonBay Communities, Inc., Broadcom, American Water Works Company Inc, American Express Co, Boeing Company, Baxter International Inc., BB&T Corporation, Bard (C.R.) Inc., Baker Hughes Inc, BIOGEN IDEC Inc., The Bank of New York Mellon Corp., Ball Corp, Bristol-Myers Squibb, Boston Scientific, BorgWarner, Boston Properties, Caterpillar Inc., Chubb Limited, CBRE Group, Crown Castle International Corp., Carnival Corp., Celgene Corp., CF Industries Holdings Inc, Citizens Financial Group, Church & Dwight, C. H. Robinson Worldwide, CIGNA Corp., Cincinnati Financial, Comerica Inc., CME Group Inc., Chipotle Mexican Grill, Cummins Inc., CMS Energy, Centene Corporation, CenterPoint Energy, Capital One Financial, Cabot Oil & Gas, The Cooper Companies, CSX Corp., CenturyLink Inc, Cognizant Technology Solutions, Citrix Systems, CVS Health, Chevron Corp., Concho Resources, Dominion Resources, Delta Air Lines, Du Pont (E.I.), Deere & Co., Discover Financial Services, Quest Diagnostics, Danaher Corp., The Walt Disney Company, Discovery Communications-A, Discovery Communications-C, Delphi Automotive, Digital Realty Trust, Dun & Bradstreet, Dover Corp., Dr Pepper Snapple Group, Duke Energy, DaVita Inc., Devon Energy Corp., eBay Inc., Ecolab Inc., Consolidated Edison, Equifax Inc., Edison Int'l, Eastman Chemical, EOG Resources, Equinix, Equity Residential, EQT Corporation, Eversource Energy, Essex Property Trust, Inc., E*Trade, Eaton Corporation, Entergy Corp., Edwards Lifesciences, Expeditors Int'l, Expedia Inc., Extra Space Storage, Fastenal Co, Facebook, Fortune Brands Home & Security, Freeport-McMoran Cp & Gld, FirstEnergy Corp, Fidelity National Information Services, Fiserv Inc, FLIR Systems, Fluor Corp., Flowserve Corporation, FMC Corporation, Federal Realty Investment Trust, First Solar Inc, Frontier Communications, General Dynamics, General Growth Properties Inc., Gilead Sciences, Corning Inc., General Motors, Genuine Parts, Garmin Ltd., Goodyear Tire & Rubber, Grainger (W.W.) Inc., Hasbro Inc., Huntington Bancshares, HCA Holdings, Welltower Inc., HCP Inc., Hess Corporation, Hartford Financial Svc.Gp., Harley-Davidson, Honeywell Int'l Inc., HP Inc., Hormel Foods Corp., Henry Schein, Host Hotels & Resorts, The Hershey Company, Humana Inc., International Business Machines, IDEXX Laboratories, Intl Flavors & Fragrances, International Paper, Interpublic Group, Iron Mountain Incorporated, Intuitive Surgical Inc., Illinois Tool Works, Invesco Ltd., J. B. Hunt Transport Services, Jacobs Engineering Group, Juniper Networks, Kimco Realty, Kinder Morgan, Coca Cola Company, Kansas City Southern, Leggett & Platt, Lennar Corp., Laboratory Corp. of America Holding, LKQ Corporation, L-3 Communications Holdings, Lilly (Eli) & Co., Lockheed Martin Corp., Alliant Energy Corp, Leucadia National Corp., Southwest Airlines, Level 3 Communications, LyondellBasell, Mastercard Inc., Mid-America Apartments, Macerich, Marriott Int'l., Masco Corp., Mattel Inc., Moody's Corp, Mondelez International, MetLife Inc., Mohawk Industries, Mead Johnson, McCormick & Co., Martin Marietta Materials, Marsh & McLennan, 3M Company, Monster Beverage, Altria Group Inc, The Mosaic Company, Marathon Petroleum, Merck & Co., Marathon Oil Corp., M&T Bank Corp., Mettler Toledo, Murphy Oil, Mylan N.V., Navient, Noble Energy Inc, NASDAQ OMX Group, NextEra Energy, Newmont Mining Corp. (Hldg. Co.), Newfield Exploration Co, Nielsen Holdings, National Oilwell Varco Inc., Norfolk Southern Corp., Nucor Corp., Newell Brands, Realty Income Corporation, ONEOK, Omnicom Group, O'Reilly Automotive, Occidental Petroleum, People's United Financial, Pitney-Bowes, PACCAR Inc., PG&E Corp., Public Serv. Enterprise Inc., PepsiCo Inc., Pfizer Inc., Principal Financial Group, Procter & Gamble, Progressive Corp., Pulte Homes Inc., Philip Morris International, PNC Financial Services, Pentair Ltd., Pinnacle West Capital, PPG Industries, PPL Corp., Prudential Financial, Phillips 66, Quanta Services Inc., Praxair Inc., PayPal, Ryder System, Royal Caribbean Cruises Ltd, Regeneron, Robert Half International, Roper Industries, Range Resources Corp., Republic Services Inc, SCANA Corp, Charles Schwab Corporation, Spectra Energy Corp., Sealed Air, Sherwin-Williams, SL Green Realty, Scripps Networks Interactive Inc., Southern Co., Simon Property Group Inc, Stericycle Inc, Sempra Energy, SunTrust Banks, State Street Corp., Skyworks Solutions, Southwestern Energy, Synchrony Financial, Stryker Corp., AT&T Inc, Molson Coors Brewing Company, Teradata Corp., Tegna, Inc., Torchmark Corp., Thermo Fisher Scientific, TripAdvisor, The Travelers Companies Inc., Tractor Supply Company, Tyson Foods, Tesoro Petroleum Co., Total System Services, Texas Instruments, Under Armour, United Continental Holdings, UDR Inc, Universal Health Services, Inc., United Health Group Inc., Unum Group, Union Pacific, United Parcel Service, United Technologies, Varian Medical Systems, Valero Energy, Vulcan Materials, Vornado Realty Trust, Verisk Analytics, Verisign Inc., Vertex Pharmaceuticals Inc, Ventas Inc, Waters Corporation, Wec Energy Group Inc, Wells Fargo, Whirlpool Corp., Waste Management Inc., Williams Cos., Western Union Co, Weyerhaeuser Corp., Wyndham Worldwide, Wynn Resorts Ltd, Cimarex Energy, Xcel Energy Inc, XL Capital, Exxon Mobil Corp., Dentsply Sirona, Xerox Corp., Xylem Inc., Yahoo Inc., Yum! Brands Inc, Zimmer Biomet Holdings, Zions Bancorp, Zoetis
Let's see if we can use PCA and t-SNE to visualize our clusters, now that we have our cluster count. We will use both the K-means and Hierarchical methods and compare.
# Copy datafrase
df_pca = df_model_scaled.copy()
# Reduce to higher dimensions first with PCA, using n_components from 2 to 10, and kmeans results *cluster=5)
for i in np.arange(2, 11):
# Initialize PCA model
pca = PCA(n_components=i) # or any number you want
X_pca = pca.fit_transform(df_pca)
# Then apply t-SNE
tsne = TSNE(n_components=2, random_state=8)
X_reduced = tsne.fit_transform(X_pca)
# Perform clustering on the reduced data (using KMeans cluster count here)
kmeans = KMeans(n_clusters=5, random_state=8)
clusters = kmeans.fit_predict(X_reduced)
# Put t-SNE results in a dataframe
df_tsne = pd.DataFrame(data=X_reduced, columns=['Component 1', 'Component 2'])
# Add the cluster labels to the dataframe
df_tsne['Cluster'] = clusters
# Show it in a scatterplot
plt.figure(figsize=(10, 8))
sns.scatterplot(data=df_tsne, x='Component 1', y='Component 2', hue='Cluster', palette='viridis', legend='full')
# Customize the plot
plt.title(f't-SNE Visualization of K-Means Clusters: {i}')
plt.xlabel('Component 1')
plt.ylabel('Component 2')
plt.legend(title='Cluster')
plt.show()
# Reduce to higher dimensions first with PCA, using n_components from 2 to 10
for i in np.arange(2, 11):
# Initialize PCA model
pca = PCA(n_components=i)
X_pca = pca.fit_transform(df_pca)
# Then apply t-SNE
tsne = TSNE(n_components=2, random_state=8)
X_reduced = tsne.fit_transform(X_pca)
# Perform hierarchical clustering (AgglomerativeClustering) on the reduced data
hierarchical = AgglomerativeClustering(n_clusters=5) # Set number of clusters to 5
clusters = hierarchical.fit_predict(X_reduced)
# Put t-SNE results in a dataframe
df_tsne = pd.DataFrame(data=X_reduced, columns=['Component 1', 'Component 2'])
# Add the cluster labels to the dataframe
df_tsne['Cluster'] = clusters
# Show it in a scatterplot
plt.figure(figsize=(10, 8))
sns.scatterplot(data=df_tsne, x='Component 1', y='Component 2', hue='Cluster', palette='viridis', legend='full')
# Customize the plot
plt.title(f't-SNE Visualization of Hierarchical Clustering (n_clusters=5): {i}')
plt.xlabel('Component 1')
plt.ylabel('Component 2')
plt.legend(title='Cluster')
plt.show()
Next, let's determine which approach to take: K-means or Hierarchical.
Execution Time
Hierarchical runs more than twice as fast as K-means:
K-means CPU times
Hierarchical CPU times
Ideal Number of Clusters
Cluster Distinction
Visualizing the clusters in each method for K-means, the elbow method reveals a very distinct elbow between a count of 5 and 6. For clusters 6 and higher, the gained distortion reduction tapers off.
For Hierarchical, our Dendogram for complete linkage is well-organized for either 5 or 6 clusters.
Given both inputs, we will choose 5 clusters, resulting in the following data point assignents.
Cluster Profile Result Comparision
K-means performed best with five clusters, allocated as follows:
Hierarchical also has five clusters, with a similar allocation. In this algorithm, more data points were assigned to Cluster 1, while other clusters saw reduced counts.
Next, let's aggregate the data by cluster so that we can compare various features per cluster.
# Create cluster-specific dataframes
df_kmean_cluster_0 = df_kmean[df_kmean['Cluster'] == 0]
df_kmean_cluster_1 = df_kmean[df_kmean['Cluster'] == 1]
df_kmean_cluster_2 = df_kmean[df_kmean['Cluster'] == 2]
df_kmean_cluster_3 = df_kmean[df_kmean['Cluster'] == 3]
df_kmean_cluster_4 = df_kmean[df_kmean['Cluster'] == 4]
list_clusters = [0,1,2,3,4]
# View details for Cluster means
for col in num_cols:
print(f'\n****** Feature {col} ******')
for cluster in list_clusters:
cluster_data = df_kmean[df_kmean['Cluster'] == cluster]
print(f'Cluster {cluster} mean for {col}: {cluster_data[col].mean()}')
****** Feature Current Price ****** Cluster 0 mean for Current Price: 48.73499975000001 Cluster 1 mean for Current Price: 64.210386025625 Cluster 2 mean for Current Price: 84.35571614285713 Cluster 3 mean for Current Price: 83.29782519843894 Cluster 4 mean for Current Price: 25.6400005 ****** Feature Price Change ****** Cluster 0 mean for Price Change: 5.318747360916667 Cluster 1 mean for Price Change: -2.1037755721874998 Cluster 2 mean for Price Change: 3.854980541142857 Cluster 3 mean for Price Change: 4.313401507676567 Cluster 4 mean for Price Change: 11.2379082245 ****** Feature Volatility ****** Cluster 0 mean for Volatility: 1.2708920399166668 Cluster 1 mean for Volatility: 2.11978607225 Cluster 2 mean for Volatility: 1.827670343428571 Cluster 3 mean for Volatility: 1.499096510122112 Cluster 4 mean for Volatility: 1.32235485 ****** Feature ROE ****** Cluster 0 mean for ROE: 26.0 Cluster 1 mean for ROE: 178.5 Cluster 2 mean for ROE: 633.5714285714286 Cluster 3 mean for ROE: 19.257425742574256 Cluster 4 mean for ROE: 12.5 ****** Feature Cash Ratio ****** Cluster 0 mean for Cash Ratio: 71.75 Cluster 1 mean for Cash Ratio: 64.125 Cluster 2 mean for Cash Ratio: 33.57142857142857 Cluster 3 mean for Cash Ratio: 70.7095709570957 Cluster 4 mean for Cash Ratio: 130.5 ****** Feature Net Cash Flow ****** Cluster 0 mean for Net Cash Flow: -4070833333.3333335 Cluster 1 mean for Net Cash Flow: 6140875.0 Cluster 2 mean for Net Cash Flow: -568400000.0 Cluster 3 mean for Net Cash Flow: 125750287.12871288 Cluster 4 mean for Net Cash Flow: 16755500000.0 ****** Feature Net Income ****** Cluster 0 mean for Net Income: 11045666666.666666 Cluster 1 mean for Net Income: -603718000.0 Cluster 2 mean for Net Income: -4968157142.857142 Cluster 3 mean for Net Income: 1295945059.4059405 Cluster 4 mean for Net Income: 13654000000.0 ****** Feature Earnings Per Share ****** Cluster 0 mean for Earnings Per Share: 2.03 Cluster 1 mean for Earnings Per Share: -3.2075 Cluster 2 mean for Earnings Per Share: -10.84142857142857 Cluster 3 mean for Earnings Per Share: 3.433415841584158 Cluster 4 mean for Earnings Per Share: 3.295 ****** Feature Estimated Shares Outstanding ****** Cluster 0 mean for Estimated Shares Outstanding: 3507085088.125 Cluster 1 mean for Estimated Shares Outstanding: 547905122.985625 Cluster 2 mean for Estimated Shares Outstanding: 398169036.44285715 Cluster 3 mean for Estimated Shares Outstanding: 452037260.1709571 Cluster 4 mean for Estimated Shares Outstanding: 2791829362.1 ****** Feature P/E Ratio ****** Cluster 0 mean for P/E Ratio: 26.986866739666667 Cluster 1 mean for P/E Ratio: 39.2437444469375 Cluster 2 mean for P/E Ratio: 42.28454148142857 Cluster 3 mean for P/E Ratio: 32.38692420680858 Cluster 4 mean for P/E Ratio: 13.649695574999999 ****** Feature P/B Ratio ****** Cluster 0 mean for P/B Ratio: -6.238311734833332 Cluster 1 mean for P/B Ratio: 0.6717558973125 Cluster 2 mean for P/B Ratio: -11.589501539285711 Cluster 3 mean for P/B Ratio: -1.4586917297656765 Cluster 4 mean for P/B Ratio: 1.5084839159999999
# Merge individual dataframes and write comparisions to new table
df_all_clusters = [df_kmean_cluster_0, df_kmean_cluster_1, df_kmean_cluster_2, df_kmean_cluster_3, df_kmean_cluster_4]
mean_table = pd.DataFrame()
for idx, df in enumerate(df_all_clusters):
temp_mean_dict = {}
for col in num_cols:
if pd.api.types.is_numeric_dtype(df[col]):
other_dfs = [df_all_clusters[i] for i in range(len(df_all_clusters)) if i != idx] # Exclude the current dataframe
col_means = [other_df[col].mean() for other_df in other_dfs]
temp_mean_dict[col] = sum(col_means) / len(col_means)
else:
temp_mean_dict[col] = 'Non-numeric'
# Add results
temp_mean_df = pd.DataFrame([temp_mean_dict], index=[f'DF {idx}'])
mean_table = pd.concat([mean_table, temp_mean_df])
# Set pandas option to display float values with 6 decimal places
pd.options.display.float_format = '{:,.2f}'.format
mean_table
#import ace_tools as tools; tools.display_dataframe_to_user(name="Mean Table from Other DataFrames", dataframe=mean_table)
| Current Price | Price Change | Volatility | ROE | Cash Ratio | Net Cash Flow | Net Income | Earnings Per Share | Estimated Shares Outstanding | P/E Ratio | P/B Ratio | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DF 0 | 64.38 | 4.33 | 1.69 | 210.96 | 74.73 | 4,079,747,790.53 | 2,344,517,479.14 | -1.83 | 1,047,485,195.42 | 31.89 | -2.72 |
| DF 1 | 60.51 | 6.18 | 1.48 | 172.83 | 76.63 | 3,060,504,238.45 | 5,256,863,645.80 | -0.52 | 1,787,280,186.71 | 28.83 | -4.44 |
| DF 2 | 55.47 | 4.69 | 1.55 | 59.06 | 84.27 | 3,204,139,457.20 | 6,347,973,431.52 | 1.39 | 1,824,714,208.35 | 28.07 | -1.38 |
| DF 3 | 55.74 | 4.58 | 1.64 | 212.64 | 74.99 | 3,030,601,885.42 | 4,781,947,880.95 | -2.18 | 1,811,247,152.41 | 30.54 | -3.91 |
| DF 4 | 70.15 | 2.85 | 1.68 | 214.33 | 60.04 | -1,126,835,542.80 | 1,692,434,145.80 | -2.15 | 1,226,299,126.93 | 35.23 | -4.65 |
# Loop through the selected columns and plot boxplots in two columns
n_cols = 2
# Iterate through numberical columns
for i, col in enumerate(list_numcol):
if col in df_kmean.columns:
# Set up subplots
fig, axs = plt.subplots(1, n_cols, figsize=(16, 5))
# Create boxplot
sns.boxplot(data=df_kmean, x='Cluster', y=col, ax=axs[0], hue='Cluster', palette='viridis')
axs[0].set_title(f'Boxplot of {col} by Cluster')
axs[0].set_xticklabels(axs[0].get_xticklabels(), rotation=90)
if i + 1 < len(list_numcol):
next_col = list_numcol[i + 1]
if next_col in df_kmean.columns:
sns.boxplot(data=df_kmean, x='Cluster', y=next_col, ax=axs[1], hue='Cluster', palette='viridis')
axs[1].set_title(f'Boxplot of {next_col} by Cluster')
axs[1].set_xticklabels(axs[1].get_xticklabels(), rotation=90)
else:
axs[1].axis('off')
plt.tight_layout()
plt.show()
if i + 1 < len(list_numcol):
i += 1
# Show the means
df_means_by_cluster = df_kmean.groupby('Cluster')[list_numcol].mean()
df_means_by_cluster = df_means_by_cluster.T # Transpose to get clusters as columns
df_means_by_cluster
| Cluster | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Current Price | 48.73 | 64.21 | 84.36 | 83.30 | 25.64 |
| Price Change | 5.32 | -2.10 | 3.85 | 4.31 | 11.24 |
| Volatility | 1.27 | 2.12 | 1.83 | 1.50 | 1.32 |
| ROE | 26.00 | 178.50 | 633.57 | 19.26 | 12.50 |
| Cash Ratio | 71.75 | 64.12 | 33.57 | 70.71 | 130.50 |
| Net Cash Flow | -4,070,833,333.33 | 6,140,875.00 | -568,400,000.00 | 125,750,287.13 | 16,755,500,000.00 |
| Net Income | 11,045,666,666.67 | -603,718,000.00 | -4,968,157,142.86 | 1,295,945,059.41 | 13,654,000,000.00 |
| Earnings Per Share | 2.03 | -3.21 | -10.84 | 3.43 | 3.29 |
| Estimated Shares Outstanding | 3,507,085,088.12 | 547,905,122.99 | 398,169,036.44 | 452,037,260.17 | 2,791,829,362.10 |
| P/E Ratio | 26.99 | 39.24 | 42.28 | 32.39 | 13.65 |
| P/B Ratio | -6.24 | 0.67 | -11.59 | -1.46 | 1.51 |
Finally, let's display boxplots of the model and the means for each numerical metric, so we can compare clusters.
# Plot boxplots with mean
for col in list_numcol:
if col in df_kmean.columns:
plt.figure(figsize=(8, 5))
sns.boxplot(data=df_kmean, x='Cluster', y=col, hue='Cluster', palette='viridis', showmeans=False)
# Use an overlaid pointplot to show means
sns.pointplot(data=df_kmean, x='Cluster', y=col, hue='Cluster', dodge=True, markers='D',
scale=0.75, ci=None, palette='dark', legend=False)
plt.xticks(rotation=90)
plt.title(f'Boxplot of {col} by Cluster')
plt.show()
Finally, let's determine which algorithm worked the best.
For both K-means and Hierarchical, the plot density decreases as the number of clusters (K-means) or components (Hierarchical) increases.
Since no extra definitive information surfaced during our PCA analysis, we will retain K-means and 5 clusters as our final model. This more evenly allocated the data points into different clusters, and since the dataset is small, we are not as concerned with increased execution time.
Some of the clusters have notable differences within the feature, as we can see in the boxplots. Other clusters show little difference, indicating that the feature may not be that important. Without normalizing the y-axis, it's difficult to compare them relative to one another.
Next, we will revisit the numerical results to determine the characteristics of various clusters.
# Re-display final results from our k-means analysis
mean_table
| Current Price | Price Change | Volatility | ROE | Cash Ratio | Net Cash Flow | Net Income | Earnings Per Share | Estimated Shares Outstanding | P/E Ratio | P/B Ratio | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DF 0 | 64.38 | 4.33 | 1.69 | 210.96 | 74.73 | 4,079,747,790.53 | 2,344,517,479.14 | -1.83 | 1,047,485,195.42 | 31.89 | -2.72 |
| DF 1 | 60.51 | 6.18 | 1.48 | 172.83 | 76.63 | 3,060,504,238.45 | 5,256,863,645.80 | -0.52 | 1,787,280,186.71 | 28.83 | -4.44 |
| DF 2 | 55.47 | 4.69 | 1.55 | 59.06 | 84.27 | 3,204,139,457.20 | 6,347,973,431.52 | 1.39 | 1,824,714,208.35 | 28.07 | -1.38 |
| DF 3 | 55.74 | 4.58 | 1.64 | 212.64 | 74.99 | 3,030,601,885.42 | 4,781,947,880.95 | -2.18 | 1,811,247,152.41 | 30.54 | -3.91 |
| DF 4 | 70.15 | 2.85 | 1.68 | 214.33 | 60.04 | -1,126,835,542.80 | 1,692,434,145.80 | -2.15 | 1,226,299,126.93 | 35.23 | -4.65 |
As the last step in our analysis, we can examine details of each cluster resulting from K-means with 5 clusters.
This cluster's data encapsulates large, major companies with moderate growth potential and strong profitability, but possibly poor management of cash.
Metrics
Companies
Anadarko Petroleum Corp, Bristol-Myers Squibb, Citigroup Inc., eBay Inc., General Motors, JPMorgan Chase & Co., Coca Cola Company, Pfizer Inc., AT&T Inc, Verizon Communications, Wells Fargo, Exxon Mobil Corp.
Sectors and Sub Industries
# @title
# Show pivot of sectors and subsectors with counts
df_kmean_cluster_0_pivot = df_kmean_cluster_0.pivot_table(
index=['GICS Sector', 'GICS Sub Industry'],
values='Security',
aggfunc={'Security': 'count'},
fill_value=0
).reset_index()
# Rename columns
df_kmean_cluster_0_pivot.columns = ['GICS Sector', 'GICS Sub Industry', 'Company Count']
# Calculate grand total
grand_total_count = df_kmean_cluster_0_pivot['Company Count'].sum()
grand_total_df = pd.DataFrame({
'GICS Sector': ['Grand Total'],
'GICS Sub Industry': [''],
'Company Count': [grand_total_count]
})
# Combine counts with grand total
combined = pd.concat([df_kmean_cluster_0_pivot, grand_total_df], ignore_index=True)
# Display the combined DataFrame
print('Cluster 0: Sector, Sub Industry and Company Count')
combined
Cluster 0: Sector, Sub Industry and Company Count
| GICS Sector | GICS Sub Industry | Company Count | |
|---|---|---|---|
| 0 | Consumer Discretionary | Automobile Manufacturers | 1 |
| 1 | Consumer Staples | Soft Drinks | 1 |
| 2 | Energy | Integrated Oil & Gas | 1 |
| 3 | Energy | Oil & Gas Exploration & Production | 1 |
| 4 | Financials | Banks | 3 |
| 5 | Health Care | Health Care Distributors | 1 |
| 6 | Health Care | Pharmaceuticals | 1 |
| 7 | Information Technology | Internet Software & Services | 1 |
| 8 | Telecommunications Services | Integrated Telecommunications Services | 2 |
| 9 | Grand Total | 12 |
Companies in this cluster are large but have poor earnings performance and show high volatility. Seeing both a negative earnings and declining stock price could indicate underlying problems. Investing in companies in the cluster requires research to find the ones with potential for turnaround.
Metrics
Companies
American Airlines Group, AbbVie, Devon Energy Corp., Freeport-McMoran Cp & Gld, IDEXX Laboratories, Masco Corp., Moody's Corp, Altria Group Inc, Newfield Exploration Co, Pitney-Bowes, Sherwin-Williams, Southwestern Energy, United Parcel Service, Wynn Resorts Ltd, Cimarex Energy, Yum! Brands
Sectors and Sub Industries
# @title
# Show pivot of sectors and subsectors with counts
df_kmean_cluster_1_pivot = df_kmean_cluster_1.pivot_table(
index=['GICS Sector', 'GICS Sub Industry'],
values='Security',
aggfunc={'Security': 'count'},
fill_value=0
).reset_index()
# Rename columns
df_kmean_cluster_1_pivot.columns = ['GICS Sector', 'GICS Sub Industry', 'Company Count']
# Calculate grand total
grand_total_count = df_kmean_cluster_1_pivot['Company Count'].sum()
grand_total_df = pd.DataFrame({
'GICS Sector': ['Grand Total'],
'GICS Sub Industry': [''],
'Company Count': [grand_total_count]
})
# Combine counts with grand total
combined = pd.concat([df_kmean_cluster_1_pivot, grand_total_df], ignore_index=True)
# Display the combined DataFrame
print('Cluster 1: Sector, Sub Industry and Company Count')
combined
Cluster 1: Sector, Sub Industry and Company Count
| GICS Sector | GICS Sub Industry | Company Count | |
|---|---|---|---|
| 0 | Consumer Discretionary | Casinos & Gaming | 1 |
| 1 | Consumer Discretionary | Restaurants | 1 |
| 2 | Consumer Staples | Tobacco | 1 |
| 3 | Energy | Oil & Gas Exploration & Production | 4 |
| 4 | Financials | Diversified Financial Services | 1 |
| 5 | Health Care | Health Care Equipment | 1 |
| 6 | Health Care | Pharmaceuticals | 1 |
| 7 | Industrials | Air Freight & Logistics | 1 |
| 8 | Industrials | Airlines | 1 |
| 9 | Industrials | Building Products | 1 |
| 10 | Industrials | Technology, Hardware, Software and Supplies | 1 |
| 11 | Materials | Copper | 1 |
| 12 | Materials | Specialty Chemicals | 1 |
| 13 | Grand Total | 16 |
This cluster consists of high-risk companies with significant losses and poor liquidity. Even though they show positive price movement, the company may have weak fundamentals. Investments should be limited and/or carefully researched for these companies.
Metrics
Companies
Allegion, Apache Corporation, Chesapeake Energy, Charter Communications, Colgate-Palmolive, Kimberly-Clark, S&P Global, Inc.
Sectors and Sub Industries
# @title
# Show pivot of sectors and subsectors with counts
df_kmean_cluster_2_pivot = df_kmean_cluster_2.pivot_table(
index=['GICS Sector', 'GICS Sub Industry'],
values='Security',
aggfunc={'Security': 'count'},
fill_value=0
).reset_index()
# Rename columns
df_kmean_cluster_2_pivot.columns = ['GICS Sector', 'GICS Sub Industry', 'Company Count']
# Calculate grand total
grand_total_count = df_kmean_cluster_2_pivot['Company Count'].sum()
grand_total_df = pd.DataFrame({
'GICS Sector': ['Grand Total'],
'GICS Sub Industry': [''],
'Company Count': [grand_total_count]
})
# Combine counts with grand total
combined = pd.concat([df_kmean_cluster_2_pivot, grand_total_df], ignore_index=True)
# Display the combined DataFrame
print('Cluster 2: Sector, Sub Industry and Company Count')
combined
Cluster 2: Sector, Sub Industry and Company Count
| GICS Sector | GICS Sub Industry | Company Count | |
|---|---|---|---|
| 0 | Consumer Discretionary | Cable & Satellite | 1 |
| 1 | Consumer Staples | Household Products | 2 |
| 2 | Energy | Integrated Oil & Gas | 1 |
| 3 | Energy | Oil & Gas Exploration & Production | 1 |
| 4 | Financials | Diversified Financial Services | 1 |
| 5 | Industrials | Building Products | 1 |
| 6 | Grand Total | 7 |
This cluster is by far the largest one. Companies falling into this cluster have good liquidity, stable performance, and are profitable. Investment comes with moderate risk but potential for growth.
Metrics
Companies
Abbott Laboratories, Adobe Systems Inc, Analog Devices, Inc., Archer-Daniels-Midland Co, Alliance Data Systems, Ameren Corp, American Electric Power, AFLAC Inc, American International Group, Inc., Apartment Investment & Mgmt, Assurant Inc, Arthur J. Gallagher & Co., Akamai Technologies Inc, Albemarle Corp, Alaska Air Group Inc, Allstate Corp, Alexion Pharmaceuticals, Applied Materials Inc, AMETEK Inc, Affiliated Managers Group Inc, Amgen Inc, Ameriprise Financial, American Tower Corp A, Amazon.com Inc, AutoNation Inc, Anthem Inc., Aon plc, Amphenol Corp, Arconic Inc, Activision Blizzard, AvalonBay Communities, Inc., Broadcom, American Water Works Company Inc, American Express Co, Boeing Company, Baxter International Inc., BB&T Corporation, Bard (C.R.) Inc., Baker Hughes Inc, BIOGEN IDEC Inc., The Bank of New York Mellon Corp., Ball Corp, Boston Scientific, BorgWarner, Boston Properties, Caterpillar Inc., Chubb Limited, CBRE Group, Crown Castle International Corp., Carnival Corp., Celgene Corp., CF Industries Holdings Inc, Citizens Financial Group, Church & Dwight, C. H. Robinson Worldwide, CIGNA Corp., Cincinnati Financial, Comerica Inc., CME Group Inc., Chipotle Mexican Grill, Cummins Inc., CMS Energy, Centene Corporation, CenterPoint Energy, Capital One Financial, Cabot Oil & Gas, The Cooper Companies, CSX Corp., CenturyLink Inc, Cognizant Technology Solutions, Citrix Systems, CVS Health, Chevron Corp., Concho Resources, Dominion Resources, Delta Air Lines, Du Pont (E.I.), Deere & Co., Discover Financial Services, Quest Diagnostics, Danaher Corp., The Walt Disney Company, Discovery Communications-A, Discovery Communications-C, Delphi Automotive, Digital Realty Trust, Dun & Bradstreet, Dover Corp., Dr Pepper Snapple Group, Duke Energy, DaVita Inc., Ecolab Inc., Consolidated Edison, Equifax Inc., Edison Int'l, Eastman Chemical, EOG Resources, Equinix, Equity Residential, EQT Corporation, Eversource Energy, Essex Property Trust, Inc., E*Trade, Eaton Corporation, Entergy Corp., Edwards Lifesciences, Exelon Corp., Expeditors Int'l, Expedia Inc., Extra Space Storage, Ford Motor, Fastenal Co, Facebook, Fortune Brands Home & Security, FirstEnergy Corp, Fidelity National Information Services, Fiserv Inc, FLIR Systems, Fluor Corp., Flowserve Corporation, FMC Corporation, Federal Realty Investment Trust, First Solar Inc, Frontier Communications, General Dynamics, General Growth Properties Inc., Gilead Sciences, Corning Inc., Genuine Parts, Garmin Ltd., Goodyear Tire & Rubber, Grainger (W.W.) Inc., Halliburton Co., Hasbro Inc., Huntington Bancshares, HCA Holdings, Welltower Inc., HCP Inc., Hess Corporation, Hartford Financial Svc.Gp., Harley-Davidson, Honeywell Int'l Inc., Hewlett Packard Enterprise, HP Inc., Hormel Foods Corp., Henry Schein, Host Hotels & Resorts, The Hershey Company, Humana Inc., International Business Machines, Intl Flavors & Fragrances, International Paper, Interpublic Group, Iron Mountain Incorporated, Intuitive Surgical Inc., Illinois Tool Works, Invesco Ltd., J. B. Hunt Transport Services, Jacobs Engineering Group, Juniper Networks, Kimco Realty, Kinder Morgan, Kansas City Southern, Leggett & Platt, Lennar Corp., Laboratory Corp. of America Holding, LKQ Corporation, L-3 Communications Holdings, Lilly (Eli) & Co., Lockheed Martin Corp., Alliant Energy Corp, Leucadia National Corp., Southwest Airlines, Level 3 Communications, LyondellBasell, Mastercard Inc., Mid-America Apartments, Macerich, Marriott Int'l., Mattel Inc., McDonald's Corp., Mondelez International, MetLife Inc., Mohawk Industries, Mead Johnson, McCormick & Co., Martin Marietta Materials, Marsh & McLennan, 3M Company, Monster Beverage, The Mosaic Company, Marathon Petroleum, Merck & Co., Marathon Oil Corp., M&T Bank Corp., Mettler Toledo, Murphy Oil, Mylan N.V., Navient, Noble Energy Inc, NASDAQ OMX Group, NextEra Energy, Newmont Mining Corp. (Hldg. Co.), Netflix Inc., Nielsen Holdings, National Oilwell Varco Inc., Norfolk Southern Corp., Northern Trust Corp., Nucor Corp., Newell Brands, Realty Income Corporation, ONEOK, Omnicom Group, O'Reilly Automotive, Occidental Petroleum, People's United Financial, PACCAR Inc., PG&E Corp., Priceline.com Inc, Public Serv. Enterprise Inc., PepsiCo Inc., Principal Financial Group, Procter & Gamble, Progressive Corp., Pulte Homes Inc., Philip Morris International, PNC Financial Services, Pentair Ltd., Pinnacle West Capital, PPG Industries, PPL Corp., Prudential Financial, Phillips 66, Quanta Services Inc., Praxair Inc., PayPal, Ryder System, Royal Caribbean Cruises Ltd, Regeneron, Robert Half International, Roper Industries, Range Resources Corp., Republic Services Inc, SCANA Corp, Charles Schwab Corporation, Spectra Energy Corp., Sealed Air, SL Green Realty, Scripps Networks Interactive Inc., Southern Co., Simon Property Group Inc, Stericycle Inc, Sempra Energy, SunTrust Banks, State Street Corp., Skyworks Solutions, Synchrony Financial, Stryker Corp., Molson Coors Brewing Company, Teradata Corp., Tegna, Inc., Torchmark Corp., Thermo Fisher Scientific, TripAdvisor, The Travelers Companies Inc., Tractor Supply Company, Tyson Foods, Tesoro Petroleum Co., Total System Services, Texas Instruments, Under Armour, United Continental Holdings, UDR Inc, Universal Health Services, Inc., United Health Group Inc., Unum Group, Union Pacific, United Technologies, Varian Medical Systems, Valero Energy, Vulcan Materials, Vornado Realty Trust, Verisk Analytics, Verisign Inc., Vertex Pharmaceuticals Inc, Ventas Inc, Waters Corporation, Wec Energy Group Inc, Whirlpool Corp., Waste Management Inc., Williams Cos., Western Union Co, Weyerhaeuser Corp., Wyndham Worldwide, Xcel Energy Inc, XL Capital, Dentsply Sirona, Xerox Corp., Xylem Inc., Yahoo Inc., Zimmer Biomet Holdings, Zions Bancorp, Zoetis
Sectors and Sub Industries
# @title
# Show all rows in the DataFrame
pd.set_option('display.max_rows', None)
# Show pivot of sectors and subsectors with counts
df_kmean_cluster_3_pivot = df_kmean_cluster_3.pivot_table(
index=['GICS Sector', 'GICS Sub Industry'],
values='Security',
aggfunc={'Security': 'count'},
fill_value=0
).reset_index()
# Rename columns
df_kmean_cluster_3_pivot.columns = ['GICS Sector', 'GICS Sub Industry', 'Company Count']
# Calculate grand total
grand_total_count = df_kmean_cluster_3_pivot['Company Count'].sum()
grand_total_df = pd.DataFrame({
'GICS Sector': ['Grand Total'],
'GICS Sub Industry': [''],
'Company Count': [grand_total_count]
})
# Combine counts with grand total
combined = pd.concat([df_kmean_cluster_3_pivot, grand_total_df], ignore_index=True)
# Display the combined DataFrame
print('Cluster 3: Sector, Sub Industry and Company Count')
combined
Cluster 3: Sector, Sub Industry and Company Count
| GICS Sector | GICS Sub Industry | Company Count | |
|---|---|---|---|
| 0 | Consumer Discretionary | Advertising | 2 |
| 1 | Consumer Discretionary | Apparel, Accessories & Luxury Goods | 1 |
| 2 | Consumer Discretionary | Auto Parts & Equipment | 2 |
| 3 | Consumer Discretionary | Automobile Manufacturers | 1 |
| 4 | Consumer Discretionary | Broadcasting & Cable TV | 2 |
| 5 | Consumer Discretionary | Cable & Satellite | 2 |
| 6 | Consumer Discretionary | Consumer Electronics | 1 |
| 7 | Consumer Discretionary | Distributors | 1 |
| 8 | Consumer Discretionary | Home Furnishings | 1 |
| 9 | Consumer Discretionary | Homebuilding | 2 |
| 10 | Consumer Discretionary | Hotels, Resorts & Cruise Lines | 4 |
| 11 | Consumer Discretionary | Household Appliances | 1 |
| 12 | Consumer Discretionary | Housewares & Specialties | 1 |
| 13 | Consumer Discretionary | Internet & Direct Marketing Retail | 4 |
| 14 | Consumer Discretionary | Leisure Products | 2 |
| 15 | Consumer Discretionary | Motorcycle Manufacturers | 1 |
| 16 | Consumer Discretionary | Publishing | 1 |
| 17 | Consumer Discretionary | Restaurants | 2 |
| 18 | Consumer Discretionary | Specialty Retail | 1 |
| 19 | Consumer Discretionary | Specialty Stores | 3 |
| 20 | Consumer Discretionary | Tires & Rubber | 1 |
| 21 | Consumer Staples | Agricultural Products | 1 |
| 22 | Consumer Staples | Brewers | 1 |
| 23 | Consumer Staples | Drug Retail | 1 |
| 24 | Consumer Staples | Household Products | 1 |
| 25 | Consumer Staples | Packaged Foods & Meats | 6 |
| 26 | Consumer Staples | Personal Products | 1 |
| 27 | Consumer Staples | Soft Drinks | 3 |
| 28 | Consumer Staples | Tobacco | 1 |
| 29 | Energy | Integrated Oil & Gas | 3 |
| 30 | Energy | Oil & Gas Equipment & Services | 3 |
| 31 | Energy | Oil & Gas Exploration & Production | 10 |
| 32 | Energy | Oil & Gas Refining & Marketing & Transportation | 6 |
| 33 | Financials | Asset Management & Custody Banks | 4 |
| 34 | Financials | Banks | 6 |
| 35 | Financials | Consumer Finance | 5 |
| 36 | Financials | Diversified Financial Services | 5 |
| 37 | Financials | Financial Exchanges & Data | 1 |
| 38 | Financials | Insurance Brokers | 3 |
| 39 | Financials | Investment Banking & Brokerage | 2 |
| 40 | Financials | Life & Health Insurance | 3 |
| 41 | Financials | Multi-Sector Holdings | 1 |
| 42 | Financials | Multi-line Insurance | 1 |
| 43 | Financials | Property & Casualty Insurance | 8 |
| 44 | Financials | Regional Banks | 3 |
| 45 | Financials | Thrifts & Mortgage Finance | 1 |
| 46 | Health Care | Biotechnology | 7 |
| 47 | Health Care | Health Care Distributors | 2 |
| 48 | Health Care | Health Care Equipment | 10 |
| 49 | Health Care | Health Care Facilities | 5 |
| 50 | Health Care | Health Care Supplies | 2 |
| 51 | Health Care | Life Sciences Tools & Services | 1 |
| 52 | Health Care | Managed Health Care | 5 |
| 53 | Health Care | Pharmaceuticals | 4 |
| 54 | Industrials | Aerospace & Defense | 4 |
| 55 | Industrials | Air Freight & Logistics | 2 |
| 56 | Industrials | Airlines | 4 |
| 57 | Industrials | Building Products | 2 |
| 58 | Industrials | Construction & Farm Machinery & Heavy Trucks | 3 |
| 59 | Industrials | Diversified Commercial Services | 1 |
| 60 | Industrials | Electrical Components & Equipment | 1 |
| 61 | Industrials | Environmental Services | 1 |
| 62 | Industrials | Human Resource & Employment Services | 1 |
| 63 | Industrials | Industrial Conglomerates | 14 |
| 64 | Industrials | Industrial Machinery | 5 |
| 65 | Industrials | Industrial Materials | 1 |
| 66 | Industrials | Railroads | 4 |
| 67 | Industrials | Research & Consulting Services | 4 |
| 68 | Industrials | Trucking | 1 |
| 69 | Information Technology | Application Software | 2 |
| 70 | Information Technology | Computer Hardware | 1 |
| 71 | Information Technology | Data Processing & Outsourced Services | 2 |
| 72 | Information Technology | Electronic Components | 2 |
| 73 | Information Technology | Electronic Equipment & Instruments | 1 |
| 74 | Information Technology | Home Entertainment Software | 1 |
| 75 | Information Technology | IT Consulting & Other Services | 3 |
| 76 | Information Technology | Internet Software & Services | 11 |
| 77 | Information Technology | Networking Equipment | 1 |
| 78 | Information Technology | Semiconductor Equipment | 1 |
| 79 | Information Technology | Semiconductors | 5 |
| 80 | Information Technology | Technology Hardware, Storage & Peripherals | 1 |
| 81 | Materials | Construction Materials | 2 |
| 82 | Materials | Diversified Chemicals | 5 |
| 83 | Materials | Fertilizers & Agricultural Chemicals | 2 |
| 84 | Materials | Gold | 1 |
| 85 | Materials | Industrial Gases | 1 |
| 86 | Materials | Metal & Glass Containers | 1 |
| 87 | Materials | Paper Packaging | 2 |
| 88 | Materials | Specialty Chemicals | 3 |
| 89 | Materials | Steel | 1 |
| 90 | Real Estate | Office REITs | 1 |
| 91 | Real Estate | REITs | 14 |
| 92 | Real Estate | Real Estate Services | 1 |
| 93 | Real Estate | Residential REITs | 4 |
| 94 | Real Estate | Retail REITs | 4 |
| 95 | Real Estate | Specialized REITs | 3 |
| 96 | Telecommunications Services | Alternative Carriers | 1 |
| 97 | Telecommunications Services | Integrated Telecommunications Services | 2 |
| 98 | Utilities | Electric Utilities | 12 |
| 99 | Utilities | MultiUtilities | 11 |
| 100 | Utilities | Water Utilities | 1 |
| 101 | Grand Total | 303 |
Only two companies fell this cluster, and both are likely good long-term investments. They appear to be in excellent financial health, have strong cash flow, and also solid earnings. Their relatively low P/E ratio suggest they are undervalued and could continue to grow.
Metrics
Companies
Bank of America Corp, Intel Corp.
Sectors and Sub Industries
# @title
# Show pivot of sectors and subsectors with counts
df_kmean_cluster_4_pivot = df_kmean_cluster_4.pivot_table(
index=['GICS Sector', 'GICS Sub Industry'],
values='Security',
aggfunc={'Security': 'count'},
fill_value=0
).reset_index()
# Rename columns
df_kmean_cluster_4_pivot.columns = ['GICS Sector', 'GICS Sub Industry', 'Company Count']
# Calculate grand total
grand_total_count = df_kmean_cluster_4_pivot['Company Count'].sum()
grand_total_df = pd.DataFrame({
'GICS Sector': ['Grand Total'],
'GICS Sub Industry': [''],
'Company Count': [grand_total_count]
})
# Combine counts with grand total
combined = pd.concat([df_kmean_cluster_4_pivot, grand_total_df], ignore_index=True)
# Display the combined DataFrame
print('Cluster 4: Sector, Sub Industry and Company Count')
combined
Cluster 4: Sector, Sub Industry and Company Count
| GICS Sector | GICS Sub Industry | Company Count | |
|---|---|---|---|
| 0 | Financials | Banks | 1 |
| 1 | Information Technology | Semiconductors | 1 |
| 2 | Grand Total | 2 |
Short-Term Investments
For investors looking for short-term profits (weeks or months), the focus will be on price momentum and volatility, as well as liquidity to handle short-term financial commitments.
Good:
Risky:
Longer-Term Investments
For investors who are willing to invest and wait (a year or longer),
Good:
Risky:
!pip install nbconvert
Requirement already satisfied: nbconvert in /usr/local/lib/python3.10/dist-packages (6.5.4) Requirement already satisfied: lxml in /usr/local/lib/python3.10/dist-packages (from nbconvert) (4.9.4) Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (4.12.3) Requirement already satisfied: bleach in /usr/local/lib/python3.10/dist-packages (from nbconvert) (6.1.0) Requirement already satisfied: defusedxml in /usr/local/lib/python3.10/dist-packages (from nbconvert) (0.7.1) Requirement already satisfied: entrypoints>=0.2.2 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (0.4) Requirement already satisfied: jinja2>=3.0 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (3.1.4) Requirement already satisfied: jupyter-core>=4.7 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (5.7.2) Requirement already satisfied: jupyterlab-pygments in /usr/local/lib/python3.10/dist-packages (from nbconvert) (0.3.0) Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (2.1.5) Requirement already satisfied: mistune<2,>=0.8.1 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (0.8.4) Requirement already satisfied: nbclient>=0.5.0 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (0.10.0) Requirement already satisfied: nbformat>=5.1 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (5.10.4) Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from nbconvert) (24.1) Requirement already satisfied: pandocfilters>=1.4.1 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (1.5.1) Requirement already satisfied: pygments>=2.4.1 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (2.18.0) Requirement already satisfied: tinycss2 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (1.3.0) Requirement already satisfied: traitlets>=5.0 in /usr/local/lib/python3.10/dist-packages (from nbconvert) (5.7.1) Requirement already satisfied: platformdirs>=2.5 in /usr/local/lib/python3.10/dist-packages (from jupyter-core>=4.7->nbconvert) (4.3.6) Requirement already satisfied: jupyter-client>=6.1.12 in /usr/local/lib/python3.10/dist-packages (from nbclient>=0.5.0->nbconvert) (6.1.12) Requirement already satisfied: fastjsonschema>=2.15 in /usr/local/lib/python3.10/dist-packages (from nbformat>=5.1->nbconvert) (2.20.0) Requirement already satisfied: jsonschema>=2.6 in /usr/local/lib/python3.10/dist-packages (from nbformat>=5.1->nbconvert) (4.23.0) Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/dist-packages (from beautifulsoup4->nbconvert) (2.6) Requirement already satisfied: six>=1.9.0 in /usr/local/lib/python3.10/dist-packages (from bleach->nbconvert) (1.16.0) Requirement already satisfied: webencodings in /usr/local/lib/python3.10/dist-packages (from bleach->nbconvert) (0.5.1) Requirement already satisfied: attrs>=22.2.0 in /usr/local/lib/python3.10/dist-packages (from jsonschema>=2.6->nbformat>=5.1->nbconvert) (24.2.0) Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.10/dist-packages (from jsonschema>=2.6->nbformat>=5.1->nbconvert) (2023.12.1) Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.10/dist-packages (from jsonschema>=2.6->nbformat>=5.1->nbconvert) (0.35.1) Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.10/dist-packages (from jsonschema>=2.6->nbformat>=5.1->nbconvert) (0.20.0) Requirement already satisfied: pyzmq>=13 in /usr/local/lib/python3.10/dist-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (24.0.1) Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.10/dist-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (2.8.2) Requirement already satisfied: tornado>=4.1 in /usr/local/lib/python3.10/dist-packages (from jupyter-client>=6.1.12->nbclient>=0.5.0->nbconvert) (6.3.3)
# Create an HTML version of this notebook - do this last
%%shell
jupyter nbconvert --to html '/content/drive/MyDrive/Learning/Data Coursework/PGP-DSBA/7-Unsupervised Learning/Project 7/USL_Project_LearnerNotebook_FullCode_Tuesday_Frase_vFinal.ipynb'
[NbConvertApp] Converting notebook /content/drive/MyDrive/Learning/Data Coursework/PGP-DSBA/7-Unsupervised Learning/Project 7/USL_Project_LearnerNotebook_FullCode_Tuesday_Frase_vFinal.ipynb to html [NbConvertApp] Writing 11319755 bytes to /content/drive/MyDrive/Learning/Data Coursework/PGP-DSBA/7-Unsupervised Learning/Project 7/USL_Project_LearnerNotebook_FullCode_Tuesday_Frase_vFinal.html